Big Data is such a big word, and everybody talks about it nowadays. It is the word in the database world. In one of our conversations I asked my friend Jasjeet Singh the same question: what is Big Data? He instantly came up with a very effective write-up. Jasjeet works as a Technical Manager with Koenig Solutions. He leads the SQL domain and has rich IT industry experience. Koenig is a 19-year-old IT training company that offers several certification choices. Its courses include SharePoint Training, Project Management certifications, Microsoft Trainings, Business Intelligence programs, and Web Design and Development courses.
Big Data, as the name suggests, is about data that is BIG in nature. The data is BIG in terms of size, and it is difficult to manage such enormous data with relational database management systems that are quite popular these days.
Big Data is not just about being large in size. It is also about variety: the data differs in form or type. Some examples of Big Data are given below:
- Scientific data related to weather, atmosphere, genetics, etc.
- Data collected by various medical procedures, such as radiology, CT scans, MRI, etc.
- Data related to Global Positioning System
- Pictures and Videos
- Radio Frequency Data
- Data that changes very rapidly, such as stock exchange information
Apart from difficulties in managing and storing such data, it is difficult to query, analyze and visualize it.
The characteristics of Big Data can be defined by four Vs:
- Volume: It simply means a large volume of data that may span petabytes, exabytes and beyond. However, the volume of data that counts as Big Data varies from organization to organization.
- Variety: As discussed above, Big Data is not limited to relational information or structured data. It can also include unstructured data like pictures, videos, text, audio, etc.
- Velocity: Velocity means the speed at which data changes. The higher the velocity, the more efficient the system must be at capturing and analyzing the data. Missing an important data point may lead to a wrong analysis or may even result in a loss.
- Veracity: It has been recently added as the fourth V, and generally means truthfulness or adherence to the truth. In terms of Big Data, it is more of a challenge than a characteristic. It is difficult to ascertain the truth from an enormous amount of data, especially data that has high velocity. There is always a chance of having imprecise and uncertain data. It is a challenging task to clean such data before it is analyzed.
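To make the veracity point a little more concrete, here is a minimal Python sketch of cleaning records before they are analyzed. The record layout, the field names (`symbol`, `price`) and the validity rule are all assumptions made up for this illustration. Real Big Data systems would apply checks like this at a much larger scale and with far richer rules.

```python
import math

def is_trustworthy(record):
    """Return True if a record has a symbol and a finite, positive price.

    The fields and the rule are hypothetical; they only illustrate the idea
    of rejecting imprecise or uncertain data before analysis.
    """
    symbol = record.get("symbol")
    price = record.get("price")
    if not symbol:
        return False  # missing or empty identifier
    if not isinstance(price, (int, float)):
        return False  # missing or non-numeric value
    return math.isfinite(price) and price > 0  # reject NaN, infinity, non-positive


def clean(records):
    """Keep only the records that pass the trust check."""
    return [r for r in records if is_trustworthy(r)]


# Made-up sample records, not real market data.
raw_ticks = [
    {"symbol": "ABC", "price": 101.5},
    {"symbol": "ABC", "price": None},
    {"symbol": "", "price": 99.0},
    {"symbol": "XYZ", "price": -4.0},
    {"symbol": "XYZ", "price": float("nan")},
    {"symbol": "XYZ", "price": 57.25},
]

cleaned = clean(raw_ticks)
print(f"kept {len(cleaned)} of {len(raw_ticks)} records")
```

In this sample only two of the six records survive the check, which shows how much of a rapidly arriving stream can be unusable if it is not cleaned first.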
Big Data can be considered the next big thing in the IT sector in terms of innovation and development. If appropriate technologies are developed to analyze and use the information, it can be the driving force for almost all industrial segments, including retail, manufacturing, service, finance and healthcare. This will help them automate business decisions, increase productivity, and innovate and develop new products.
Thanks to Jasjeet Singh for an excellent write-up. Jasjeet Singh works as a Technical Manager with Koenig Solutions.
Reference: Pinal Dave (http://blog.SQLAuthority.com)