Big Data – Basics of Big Data Architecture – Day 4 of 21

In yesterday’s blog post we learned how Big Data evolved. Today we will look at the basics of Big Data architecture.

Big Data Cycle

Just like every other database-related application, a Big Data project has its own development cycle, and the three Vs (volume, velocity and variety) play an important role in deciding its architecture. A Big Data project also goes through the familiar phases of capturing, transforming, integrating and analyzing data, and then building actionable reporting on top of it.
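To make the phases concrete, here is a minimal sketch in Python that chains capture, transform, integrate, analyze and report as plain functions. The two inline sources, the field names (region, amount) and every function name are hypothetical and exist only for this illustration; a real Big Data project would run each phase on dedicated tools rather than in one script.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical sample inputs standing in for two different data sources.
CSV_SOURCE = "region,amount\nnorth,100\nsouth,250\n"
JSON_SOURCE = '[{"region": "north", "amount": "50"}, {"region": "east", "amount": "75"}]'


def capture():
    """Capture: read raw records from each source."""
    csv_rows = list(csv.DictReader(io.StringIO(CSV_SOURCE)))
    json_rows = json.loads(JSON_SOURCE)
    return csv_rows, json_rows


def transform(rows):
    """Transform: normalize types so every record has the same shape."""
    return [{"region": r["region"], "amount": float(r["amount"])} for r in rows]


def integrate(*sources):
    """Integrate: merge the transformed sources into one data set."""
    merged = []
    for rows in sources:
        merged.extend(rows)
    return merged


def analyze(records):
    """Analyze: total amount per region."""
    totals = defaultdict(float)
    for r in records:
        totals[r["region"]] += r["amount"]
    return dict(totals)


def report(totals):
    """Report: render the analysis as text lines for end users."""
    return [f"{region}: {amount:.2f}" for region, amount in sorted(totals.items())]


def run_pipeline():
    csv_rows, json_rows = capture()
    records = integrate(transform(csv_rows), transform(json_rows))
    return report(analyze(records))


if __name__ == "__main__":
    for line in run_pipeline():
        print(line)
```

The point of the sketch is only the ordering of the phases: each function consumes what the previous one produced, which is the same cycle whether the data fits in memory or spans a cluster.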

While the process looks almost the same, the nature of the data often makes the architecture totally different. Here are a few questions everyone should ask before going ahead with a Big Data architecture.

Questions to Ask

How big is your total database?

What are your reporting requirements in terms of time – real time, semi real time or at frequent intervals?

How important is the data availability and what is the plan for disaster recovery?

What are the plans for network and physical security of the data?

What platform will be the driving force behind the data and what are the different service level agreements for the infrastructure?

These are just basic questions; based on your application and business needs you should come up with your own list of questions to ask. As I mentioned earlier, these questions may look quite simple, but the answers will not be. In a Big Data implementation there are many other important aspects to consider before deciding on the architecture.

Building Blocks of Big Data Architecture

It is impossible to discuss and nail down the optimal architecture for every Big Data solution in a single blog post; however, we can discuss the basic building blocks of Big Data architecture. Here is the image I built to explain how the building blocks of the Big Data architecture work together.

[Image: building blocks of Big Data architecture]

The image above gives a good overview of how the various components of a Big Data architecture are associated with each other. Because many different data sources are part of the architecture, extraction, transformation and integration form one of its most essential layers. Most of the data is stored in relational as well as non-relational data marts and data warehousing solutions. As the business requires, the data is processed and converted into reports and visualizations for end users. Hardware is just as important as software in a Big Data architecture: failover instances and redundant physical infrastructure are usually implemented.
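The failover idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any particular product does it: StorageNode, NodeUnavailable and read_with_failover are hypothetical names, and the "nodes" are in-memory dictionaries standing in for redundant servers that hold the same data.

```python
class NodeUnavailable(Exception):
    """Raised when a storage node cannot serve a request."""


class StorageNode:
    """Hypothetical in-memory stand-in for one storage server."""

    def __init__(self, name, data, healthy=True):
        self.name = name
        self.data = dict(data)
        self.healthy = healthy

    def read(self, key):
        if not self.healthy:
            raise NodeUnavailable(self.name)
        return self.data[key]


def read_with_failover(nodes, key):
    """Try each node in order; return (node_name, value) from the first healthy one."""
    failed = []
    for node in nodes:
        try:
            return node.name, node.read(key)
        except NodeUnavailable:
            failed.append(node.name)
    raise NodeUnavailable("all nodes failed: " + ", ".join(failed))


if __name__ == "__main__":
    data = {"orders_2013": 1250}
    primary = StorageNode("primary", data, healthy=False)  # simulate an outage
    replica = StorageNode("replica", data)
    print(read_with_failover([primary, replica], "orders_2013"))
```

Because the replica holds a copy of the same data, the caller still gets an answer while the primary is down. Real systems add health checks, replication lag handling and automatic promotion on top of this basic pattern.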

NoSQL in Data Management

NoSQL is a very popular buzzword, and it means Not Relational or Not Only SQL. This is because in a Big Data architecture the data can be in any format: unstructured, relational, or any other format from any other data source. Relational technology alone is not enough to bring all of this data together, hence new tools, architectures and algorithms have been invented to take care of all kinds of data. These are collectively called NoSQL.
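As a rough illustration of data "in any format" living side by side, here is a toy document collection in Python. It is only a sketch: DocumentCollection and build_sample are hypothetical names, the records are made up, and no real NoSQL product is being modeled. The thing to notice is that each record carries its own fields and no table schema is declared up front.

```python
import json


class DocumentCollection:
    """Toy in-memory document store; not a real NoSQL product."""

    def __init__(self):
        self._docs = []

    def insert(self, doc):
        # Round-trip through JSON so we store an independent, serializable copy.
        self._docs.append(json.loads(json.dumps(doc)))

    def find(self, **criteria):
        """Return documents whose fields match every given key/value pair."""
        return [d for d in self._docs
                if all(d.get(k) == v for k, v in criteria.items())]


def build_sample():
    """Load records with different shapes into one collection."""
    events = DocumentCollection()
    events.insert({"type": "order", "id": 1, "total": 99.5})
    events.insert({"type": "tweet", "user": "alice", "text": "big data!", "tags": ["bigdata"]})
    events.insert({"type": "log", "level": "ERROR", "message": "disk full"})
    events.insert({"type": "order", "id": 2, "total": 15.0, "coupon": "FALL13"})
    return events


if __name__ == "__main__":
    events = build_sample()
    print(events.find(type="order"))
    print(events.find(level="ERROR"))
```

In a relational table the tweet, the log line and the two orders would each need their own table or many empty columns; here they share one collection and are queried by whatever fields they happen to have.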


Over the next four days we will tackle the buzzword Hadoop.

Reference: Pinal Dave (


4 Comments

  • Thanks Pinal for your wonderful post

  • Haresh Ambaliya
    October 4, 2013 11:31 pm

    Hi Pinal,
    There are lots of NoSQL databases on the internet, and many of them are open source (like CouchDB, RavenDB and of course your favorite NuoDB). So, I am confused about how to find which one is suitable for my application. Are there any criteria to find the best match?

  • Imran Mohammed
    October 5, 2013 9:38 pm


    When it comes to reality, when you plan to change technology, the first question asked is (missing in your list),

    1. How much would this change cost the organization?

  • Jithesh Krishnan
    June 16, 2016 2:09 am

    Typo- Para “Big Data Cycle”
    “bit data project have its development cycle” to “big data project have its development cycle”

