Big Data – Beginning Big Data – Day 1 of 21

What is Big Data?

I want to learn Big Data. I have no clue where and how to start learning about it.

Does Big Data really mean the data is big?

What are the tools and software I need to know to learn Big Data?

I often receive questions like the ones above. They are good questions, and honestly, when we search online it is hard to find authoritative and authentic answers. I have been working with Big Data and NoSQL for a while, and I have decided to discuss this subject here on the blog.

In the next 21 days we will understand what is so big about Big Data.

Big Data – Big Thing!

Big Data is becoming one of the most talked-about technology trends. The real challenge for big organizations is to get the maximum out of the data already available and to predict what kind of data to collect in the future. How to take existing data and make it meaningful, so that it provides accurate insight into the past, is a key discussion point in many executive meetings. With the explosion of data the challenge has gone to the next level, and Big Data is now becoming a reality in many organizations.

Big Data – A Rubik’s Cube

I like to compare Big Data with a Rubik’s cube. I believe they have many similarities. Just like a Rubik’s cube, Big Data has many different solutions. Let us visualize a Rubik’s cube solving challenge in which many experts participate. Suppose you take five Rubik’s cubes, scramble them the same way, and give them to five different experts to solve. It is quite possible that all five will solve the cube within seconds, but if you watch closely you will notice that even though the final outcome is the same, the route taken to get there is not. Every expert will start at a different place and use different methods. Some will solve one color first and others will solve another color first. Even if they follow the same kind of algorithm, they will start and end at different places, and their moves will differ on many occasions. It is nearly impossible for two experts to take exactly the same route.

Big Market and Multiple Solutions

Big Data is exactly like a Rubik’s cube – even though the goal of every organization and expert is the same, to get the maximum out of the data, the route and the starting point are different for each of them. As organizations evaluate and architect Big Data solutions, they are also learning about the approaches and opportunities related to Big Data. There is no single solution to Big Data, and no single vendor can claim to know everything about it. Honestly, Big Data is too big a concept, and there are many players – different architectures, different vendors and different technologies.

What is Next?

In this 21-day series we will explore many essential topics related to Big Data. I do not claim that you will be a master of the subject after 21 days, but I will cover the following topics in easy-to-understand language.

  • Architecture of Big Data
  • Big Data Management and Implementation
  • Different Technologies – Hadoop, MapReduce
  • Real World Conversations
  • Best Practices


In tomorrow’s blog post we will try to answer one of the most essential questions – What is Big Data?

Reference: Pinal Dave (

29 thoughts on “Big Data – Beginning Big Data – Day 1 of 21”

  1. I can’t wait to hear all about it. I have heard so many different things, and your Rubik’s cube analogy seems to hit the nail on the head.


  2. It is not so much about data, but what you do with the data. To me “Big Data” is just a term to describe ALL of the data. It could be in the data warehouse, CRM database, text files, web logs, the list goes on.

    How we use the data to make better decisions, improve processes and reduce risks is what makes “Big Data” valuable.

    Just like the various definitions for Big Data there are numerous tools to process that data. A Hadoop cluster won’t work to alert me of a potential server failure, but it can crunch millions of rows from various sources very efficiently to identify a fraudulent claim. A Python script can alert me to a potential server failure, but would not be efficient at identifying a fraudulent claim.

    Looking forward to following along on this journey as Big Data progresses into better decisions.

    Michael Heindel


  3. Looking forward to this series. Thanks Pinal, appreciate your time like always.

    I know it’s too early to ask this question, but what’s your recommendation as to when an organization should decide to move from RDBMS to NoSQL, for example when data size reaches 500 GB or 1 TB?


  4. Thanks Pinal. I searched a lot about Big Data but did not find any good articles on it. I hope this 31 days series works out. Please include some real-world examples as well. Thanks.


