Big Data – Role of Cloud Computing in Big Data – Day 11 of 21

October 15, 2013

In yesterday’s blog post we learned the importance of NewSQL. In this article we will understand the role of Cloud Computing in the Big Data story.

What is Cloud?

Cloud has been the biggest buzzword of the last few years. Everyone knows about the Cloud and it is extremely well defined online, so in this article we will discuss cloud only in the context of Big Data. Cloud computing is a method of providing shared computing resources to applications that require dynamic resources. These resources include applications, computing, storage, networking, development and various deployment platforms. The fundamental idea of cloud computing is that it shares pretty much all of these resources and delivers them to end users as a service.

Google and Amazon.com are examples of companies that combine Cloud Computing and Big Data. Both have fantastic Big Data offerings built with the help of the cloud. We will discuss them later in this blog post.

There are two different Cloud Deployment Models: 1) The Public Cloud and 2) The Private Cloud

Public Cloud Computing

Public Cloud is the cloud infrastructure built by commercial providers (Amazon, Rackspace, etc.), who create a highly scalable data center that hides the complex infrastructure from the consumer and provides various services.

Private Cloud Computing

Private Cloud is the cloud infrastructure built by a single organization, which manages a highly scalable data center internally.

Here is a quick comparison between Public Cloud and Private Cloud, from Wikipedia:

	Public Cloud	Private Cloud
Initial cost	Typically zero	Typically high
Running cost	Unpredictable	Unpredictable
Customization	Impossible	Possible
Privacy	No (host has access to the data)	Yes
Single sign-on	Impossible	Possible
Scaling up	Easy while within defined limits	Laborious but no limits
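
The cost rows of this comparison can be made concrete with a small break-even calculation. The following is only a rough sketch: the function names and all the figures are hypothetical, and it assumes a flat monthly running cost, which is a simplification given that the table above lists running costs as unpredictable.

```python
def cumulative_cost(initial_cost, monthly_cost, months):
    """Total spend after `months`, assuming a flat monthly running cost."""
    return initial_cost + monthly_cost * months


def break_even_month(public_monthly, private_initial, private_monthly):
    """First month at which the private cloud is no more expensive than the
    public cloud. The public cloud is assumed to have zero initial cost.

    Returns None if the private cloud never catches up.
    """
    if private_monthly >= public_monthly:
        return None
    saving_per_month = public_monthly - private_monthly
    # Ceiling division on integers: round the number of months up.
    return -(-private_initial // saving_per_month)


# Hypothetical figures, for illustration only.
months = break_even_month(public_monthly=8000,
                          private_initial=120000,
                          private_monthly=3000)
print(months)
```

With these made-up numbers the high initial cost of the private cloud is recovered only after two years, which is why the initial cost row matters so much for short-lived or experimental Big Data projects.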

Hybrid Cloud

Hybrid Cloud is the cloud infrastructure built by composing two or more clouds, such as a public and a private cloud. Hybrid cloud gives the best of both worlds as it combines multiple cloud deployment models.

Cloud and Big Data – Common Characteristics

Many characteristics of Cloud Architecture and Cloud Computing are also essential for Big Data. The two overlap heavily, and in many places it just makes sense to use the power of both architectures to build a highly scalable framework.

Here is the list of the characteristics of cloud computing that are important in Big Data:

Scalability
Elasticity
Ad-hoc Resource Pooling
Low Cost to Set Up Infrastructure
Pay on Use or Pay as you Go
Highly Available
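
To see how elasticity and pay-as-you-go work together, here is a minimal sketch that compares the node-hours consumed by an elastic cluster with those of a cluster sized permanently for peak load. The function names, the hourly load figures and the per-node capacity are all assumptions made up for this illustration; no real provider API is involved.

```python
import math


def nodes_needed(load, capacity_per_node, min_nodes=1):
    """Number of nodes required to serve `load`, never fewer than `min_nodes`."""
    return max(min_nodes, math.ceil(load / capacity_per_node))


def elastic_node_hours(hourly_load, capacity_per_node):
    """Node-hours used when the cluster is resized every hour (pay as you go)."""
    return sum(nodes_needed(load, capacity_per_node) for load in hourly_load)


def fixed_node_hours(hourly_load, capacity_per_node):
    """Node-hours used when the cluster is sized once, for the peak hour."""
    peak = max(nodes_needed(load, capacity_per_node) for load in hourly_load)
    return peak * len(hourly_load)


# Hypothetical workload: requests per hour, with one short spike.
hourly_load = [100, 100, 900, 2500, 400, 100]
elastic = elastic_node_hours(hourly_load, capacity_per_node=500)
fixed = fixed_node_hours(hourly_load, capacity_per_node=500)
print(elastic, fixed)
```

Big Data workloads are often spiky in exactly this way (for example a nightly batch job), so a cluster that grows and shrinks with the load is billed for far fewer node-hours than one provisioned for the peak all day.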

Leading Big Data Cloud Providers

There are many players in the Big Data Cloud, but we will list only a few of the well-known ones here.

Amazon

Amazon is arguably the most popular Infrastructure as a Service (IaaS) provider. The history of how Amazon started in this business is very interesting. They started out with a massive infrastructure to support their own business. Gradually they figured out that their own resources were underutilized most of the time. They decided to get the maximum out of the resources they had, and hence they launched their Amazon Elastic Compute Cloud (Amazon EC2) service in 2006. Their products have evolved a lot since then, and cloud services are now one of their primary businesses besides retail.

Amazon also offers Big Data services under Amazon Web Services. Here is the list of the included services:

Amazon Elastic MapReduce – It processes very high volumes of data
Amazon DynamoDB – It is a fully managed NoSQL (Not Only SQL) database service
Amazon Simple Storage Service (S3) – A web-scale service designed to store and accommodate any amount of data
Amazon High Performance Computing – It provides low-latency tuned high performance computing clusters
Amazon Redshift – It is a petabyte scale data warehousing service
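
The first service in the list is built around the MapReduce programming model. As a rough illustration of that model only, here is a word count written in plain Python that runs on a single machine. It does not call any Amazon API, and the function names are made up for this sketch; a hosted service would run the map and reduce steps in parallel across many nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an iterable of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["big data", "Big cloud"])
print(counts)
```

Because each line is mapped independently and each key is reduced independently, the work can be split across as many machines as the data volume requires, which is what makes the model a natural fit for elastic cloud infrastructure.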

Google

Though Google is known for its search engine, we all know that it is much more than that. Here are some of its Big Data related cloud services:

Google Compute Engine – It offers secure, flexible computing from energy efficient data centers
Google BigQuery – It allows SQL-like queries to run against large datasets
Google Prediction API – It is a cloud based machine learning tool

Other Players

Besides Amazon and Google, there are other players in the Big Data market. Microsoft is also attempting Big Data with the Cloud with Microsoft Azure. Additionally, Rackspace and NASA together have initiated OpenStack. The goal of OpenStack is to provide a massively scaled, multitenant cloud that can run on any hardware.

Things to Watch

Cloud based solutions integrate very well with the Big Data story and are also very economical to implement. However, there are a few things one should be very careful about when deploying Big Data on cloud solutions. Here is a list of a few things to watch:

Data Integrity
Initial Cost
Recurring Cost
Performance
Data Access Security
Location
Compliance

Every company has a different approach to Big Data and is subject to different rules and regulations. Based on these factors, each company can implement its own custom Big Data solution on a cloud.

Tomorrow

In tomorrow’s blog post we will discuss various Operational Databases supporting Big Data.

Reference: Pinal Dave (https://blog.sqlauthority.com)

Comments

Susan Bilder
October 29, 2013 12:57 am
We all understand the future is going towards cloud computing and using other companies to host their data and applications. One of the concerns is the performance of these applications in the cloud. Since your application is on a shared resource you may not know how fast your application will run until it is fully implemented. Do you truly know how much bandwidth is available to your application? Hopefully with time better metrics will be provided.