Big Data – Is Big Data Relevant to me? – Big Data Questionnaires – Guest Post by Vinod Kumar

This guest post is by Vinod Kumar. Vinod Kumar has worked with SQL Server extensively since joining the industry over a decade ago. Having worked on various versions of SQL Server 7.0, Oracle 7.3 and other database technologies, he now works with the Microsoft Technology Center (MTC) as a Technology Architect.

Let us read the blog post in Vinod’s own voice.


I think the series from Pinal is a good one for anyone planning to start on the Big Data journey from the basics. In my daily customer interactions this buzz of “Big Data” always comes up, and I generally react by saying – “Sir, do you really have a ‘Big Data’ problem or do you have a big Data problem?” Generally, there is a silence in the air when I ask this question. Data is everywhere in organizations – be it big data, small data, all data and, for a few, bad data, which is the same as no data :). Now, don’t discount me as someone who opposes “Big Data”; I am as much a supporter of it as I am a critic of the abuse of this term.

In this post, I wanted to let my mind flow so that you can also look at these concepts from the direction I have in mind. In any case, this is not an exhaustive dump of what is in my mind – but you will surely get the drift of how I question customers about Big Data terms!

Is Big Data Relevant to me?

Many of my customers talk to me like a blank whiteboard, with no idea of “why Big Data”. They want to jump on the technology bandwagon, and they want to decipher insights from their unexplored data, a.k.a. unstructured data, together with structured data. So what are the industry scenarios that come to mind? Here are some of them:

Financials

  • Fraud detection: Banks and credit card companies monitor your spending habits in real time.
  • Customer Segmentation: Applies in every industry, from Banking to Retail to Aviation to Utility and others, wherever they deal with end customers who consume their products and services.
  • Customer Sentiment Analysis: Respond to negative brand perception on social media or amplify positive perception.
  • Sales and Marketing Campaign: Understand the impact and get closer to customer delight.
  • Call Center Analysis: Take unstructured voice recordings and analyze them for content and sentiment.

Medical

  • Reduce Re-admissions: How to build proactive follow-up engagements with patients.
  • Patient Monitoring: How to track Inpatient, Outpatient, Emergency Visits, Intensive Care Units etc.
  • Preventive Care: Disease identification and risk stratification are crucial business functions for the medical field.
  • Claims fraud detection: There is no precise dollar figure that one can put here, but this is a big thing for the medical field.

Retail

  • Customer Sentiment Analysis, Customer Care Centers, Campaign Management.
  • Supply Chain Analysis: Every sensor reading and RFID data point can be tracked for warehouse space optimization.
  • Location based marketing: Based on where a check-in happens, retail stores can optimize their marketing.

Telecom

  • Price optimization and Plans, Finding Customer churn, Customer loyalty programs
  • Call Detail Record (CDR) Analysis, Network optimizations, User Location analysis
  • Customer Behavior Analysis

Insurance

  • Fraud Detection & Analysis, Pricing based on customer
  • Sentiment Analysis, Loyalty Management
  • Agents Analysis, Customer Value Management

This list can go on to other areas like Utility, Manufacturing, Travel, ITES etc. So as you can see, there are obviously interesting use cases for each of these industry verticals. This is just a representative list.

Where to start?

A lot of times I try to quiz customers on a number of dimensions before starting a Big Data conversation.

  • Are you getting the data you need the way you want it and in a timely manner?
  • Can you get in and analyze the data you need?
  • How quickly does IT respond to your BI requests?
  • How easily can you get at the data that you need to run your business/department/project?
  • How are you currently measuring your business?
  • Can you get the data you need to react WITHIN THE QUARTER to impact behaviors to meet your numbers or is it always “rear-view mirror?”
  • How are you measuring:
    • The Brand
    • Customer Sentiment
    • Your Competition
    • Your Pricing
    • Your performance
    • Supply Chain Efficiencies
    • Predictive product / service positioning
  • What are your key challenges in driving collaboration across your global business? What are the challenges in innovation?
  • What challenges are you facing in getting more information out of your data?

Note: Garbage in is garbage out. This holds good for all reporting / analytics requirements.

Big Data POCs?

A number of customers get into the realm of setting up a small team to work on Big Data. It is a great start from an understanding point of view, but I tend to ask such customers a number of other questions. Some of these common questions are:

  1. To what degree are your advanced analytics (natural language processing, sentiment analysis, predictive analytics and classification) paired with your Big Data efforts?
  2. Do you have dedicated resources exploring the possibilities of advanced analytics in Big Data for your business line?
  3. Do you plan to employ machine learning technology while doing Advanced Analytics?
  4. How is Social Media being monitored in your organization?
  5. What is your ability to scale in terms of storage and processing power?
  6. Do you have a system in place to sort incoming data in near real time by potential value, data quality, and use frequency?
  7. Do you use event-driven architecture to manage incoming data?
  8. Do you have specialized data services that can accommodate different formats, security, and the management requirements of multiple data sources?
  9. Is your organization currently using or considering in-memory analytics?
  10. To what degree are you able to correlate data from your Big Data infrastructure with that from your enterprise data warehouse?
  11. Have you extended the role of Data Stewards to include ownership of big data components?
  12. Do you prioritize data quality based on the source system (for example, Facebook/Twitter data has lower quality thresholds than radio frequency identification (RFID) data for a tracking system)?
  13. Do your retention policies consider the different legal responsibilities for storing Big Data for a specific amount of time?
  14. Do Data Scientists work in close collaboration with Data Stewards to ensure data quality?
  15. How is access to attributes of Big Data being given out in the organization?
  16. Are roles related to Big Data (Advanced Analyst, Data Scientist) clearly defined?
  17. How involved is risk management in the Big Data governance process?
  18. Is there a set of documented policies regarding Big Data governance?
  19. Is there an enforcement mechanism or approach to ensure that policies are followed?
  20. Who is the key sponsor for your Big Data governance program? (The CIO is best)
  21. Do you have defined policies surrounding the use of social media data for potential employees and customers, as well as the use of customer Geo-location data?
  22. How accessible are complex analytic routines to your user base?
  23. What is the level of involvement with outside vendors and third parties in regard to the planning and execution of Big Data projects?
  24. What programming technologies are utilized by your data warehouse/BI staff when working with Big Data?

These are some of the important questions I ask each customer who is actively evaluating Big Data trends for their organization. These questions give you a sense of direction on where to start, what to use, how to secure, how to analyze and more.
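
To make question 12 a little more concrete, here is a minimal Python sketch of what "quality thresholds per source system" could look like. Everything in it is an assumption for illustration: the source names, the threshold values, the `Record` type and the idea that a quality score between 0 and 1 has already been computed upstream. It is not taken from any particular product.

```python
from dataclasses import dataclass

# Hypothetical minimum quality scores (0.0 to 1.0) per source system.
# Social feeds are allowed to be noisier than RFID tracking data.
QUALITY_THRESHOLDS = {
    "social": 0.4,
    "rfid": 0.95,
}
# Assumed fallback for sources without an explicit threshold.
DEFAULT_THRESHOLD = 0.8


@dataclass
class Record:
    source: str     # name of the source system
    quality: float  # quality score, assumed to be computed upstream


def accept(record: Record) -> bool:
    """Return True if the record meets the threshold for its source."""
    threshold = QUALITY_THRESHOLDS.get(record.source, DEFAULT_THRESHOLD)
    return record.quality >= threshold


records = [Record("social", 0.5), Record("rfid", 0.9), Record("crm", 0.85)]
accepted = [r.source for r in records if accept(r)]
print(accepted)  # ['social', 'crm']
```

The point is only that the same score can be acceptable for one source and not for another: 0.5 passes for the social feed, while 0.9 is rejected for RFID.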

Sign off

Any Big Data analysis is incomplete without a compelling story. The best way to understand this is to watch Hans Rosling’s Gapminder videos (2:17 to 6:06) about the third world myths. Don’t get overwhelmed by the Big Data buzzword; the destination your data points to is what matters.

In this blog post, we did not particularly look at any Big Data technologies. This is a set of questions one needs to keep in mind when embarking on a Big Data journey. I did write some of the basics in my blog: Big Data – Big Hype yet Big Opportunity. Do let me know if these questions make sense.

 Reference: Pinal Dave (http://blog.sqlauthority.com)

10 thoughts on “Big Data – Is Big Data Relevant to me? – Big Data Questionnaires – Guest Post by Vinod Kumar”

  1. Hi Vinod,
    Thanks for such an informative and descriptive account of the prerequisites for considering whether Big Data would be the solution. I have even gone through your ‘Big Data – Big Hype Yet Big Opportunity’ link; as mentioned there, the data would be in the range of petabytes to exabytes. My question is: how would you predict the scope of Big Data in India, where data at the petabyte or exabyte scale is less common compared to the rest of the world? Also, as it is more costly, would companies in India even like to opt for Big Data?

    Regards,
    Rahul

    • Rahul – Though I talk about exabytes or petabytes, Big Data need not always be at that scale. Sometimes just analyzing unstructured data with current market data can run to a few GBs, and even there this is applicable.

      Now to companies in India, well that is a tough ask because many don’t know what to expect. There are a number of them already using these solutions and are successful in their own way. Telecom and Health are the sectors where this has already got adoption even before the industry could adopt.

      The idea of moving into Hadoop and other mechanisms is to take your commodity hardware and scale out your storage needs. Yes, there are some cost implications, but companies are making slow and calculated investments.

    • Shan – The theme of the post is: don’t try to jump into the so-called “Big Data” hype. Evaluate your needs and then see if Big Data will solve your problem.

      Let us refrain from positioning “Big Data” as a solution to a problem we are searching today inside our enterprise.

      Ask some / all of the questions I generally ask customers to assess if Big Data will solve any specific problem your organization has.

      It is easy to get carried away with industry jargon, but this is not a be-all and end-all solution to all your problems. Hope this gives more context on why I wrote this post.

  2. Amazing blog.. I believe if any organization could answer the first question “Sir, do you really have a ‘Big Data’ problem or do you have a big Data problem?”.
    I don’t see Space technology in the list. Isn’t there any business case in that area?

    • Glad I could give you a start Mukhesh !!! Thanks for your kind words. Now, I have not outlined all the businesses that can benefit from Big Data – in fact most industries, including Space Sciences, are great candidates for Big Data usage (if done right) :) …

  3. Bigdata is like teenage [] : everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it – captured from FB
