According to MacLennan et al. (2009), data mining is “the process of analyzing data to find hidden patterns using automatic methodologies.” A simple example illustrates the concept. By analyzing data on the items purchased from a supermarket or a chain of such stores, a retailer can learn which products sell the most, increase the supply of those products, and reduce the supply of those that sell poorly. In short, data mining is an analytical activity that looks for hidden patterns in a large body of data after the data has been appropriately classified and sorted.
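The supermarket example can be sketched in a few lines of Python. This is only an illustration: the baskets below are made-up data and `top_sellers` is a hypothetical helper name. A real analysis would read sales records from a database and use a dedicated mining tool.

```python
from collections import Counter

# Hypothetical transactions: each inner list is one customer's basket.
transactions = [
    ["bread", "milk", "eggs"],
    ["bread", "butter"],
    ["milk", "bread", "butter"],
    ["eggs", "milk"],
    ["bread", "jam"],
]

def top_sellers(baskets, n=2):
    """Return the n most frequently purchased items with their counts."""
    counts = Counter(item for basket in baskets for item in basket)
    return counts.most_common(n)

# The items at the top of this list are candidates for increased supply.
print(top_sellers(transactions))
```

Counting is the simplest possible "pattern"; it stands in here for the more elaborate techniques that a mining tool applies automatically.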
Who Is Involved in Data Mining?
Data mining is an activity that can be programmed. It involves analyzing data and revealing the hidden patterns in it. Architects, developers, and analysts all take part in the data mining process.
Data mining is usually carried out by an analyst. The analyst will not always be able to identify every hidden pattern in a given data set, whatever its size. The patterns that are identified are then converted into useful information for business purposes. A developer integrates data mining into application solutions, and an architect understands the needs of the developer and the analyst and meets them accordingly.
Microsoft and Data Mining
Microsoft provides a wide range of data mining options, which include collaborative solutions and ad hoc analysis (in MS Office Excel). A free plug-in is available for MS Office Excel 2007 that helps the analyst explore patterns in the data. In addition to this plug-in, the Business Intelligence Development Studio (BIDS), which is included free with SQL Server, can also be used for data mining.
Note that data mining does not rely on previously known patterns or on any additional information. The results are generated from the data presented and not from any other source. Microsoft data mining applies mathematical techniques to the available data set to obtain models. In addition to BIDS, Microsoft provides the .NET Framework and the Data Mining Extensions (DMX) language for custom solutions. Data mining is sometimes also referred to as machine learning.
Results of Data Mining – Data Mining Models
Microsoft data mining produces data mining models, which contain statistical information that is either predictive or descriptive. A Microsoft mining model consists of three components: metadata, which is information about the data; patterns, which are mathematical formulas or rules; and bindings, which identify where the data is defined. The statistical results may not be understandable from a business perspective, so these results or models must be translated into useful business information. The person who performs the data mining is responsible for linking the resulting model to the relevant business problem.
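The three components can be pictured with a small Python data structure. This is a conceptual sketch only: `MiningModelSketch`, its field layout, and the sample rule are all hypothetical and do not reflect how Microsoft stores or exposes a mining model.

```python
from dataclasses import dataclass, field

@dataclass
class MiningModelSketch:
    # Metadata: information about the data (here, column names and types).
    metadata: dict
    # Patterns: rules or formulas found in the data.
    patterns: list = field(default_factory=list)
    # Bindings: where the source data is defined.
    bindings: dict = field(default_factory=dict)

    def describe(self):
        """Translate the stored rules into plain business statements."""
        return [
            f"Customers who buy {p['if']} also tend to buy {p['then']} "
            f"({p['confidence']:.0%} of the time)"
            for p in self.patterns
        ]

# All values below are invented for illustration.
model = MiningModelSketch(
    metadata={"columns": {"customer_id": "int", "item": "str"}},
    patterns=[{"if": "bread", "then": "butter", "confidence": 0.5}],
    bindings={"source": "sales_table"},
)
for line in model.describe():
    print(line)
```

The `describe` method stands for the translation step: turning a statistical rule into a sentence that a business user can act on.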
Role of Data Miner (or Analyst)
A data miner should be adequately trained in all the tools and technologies used in mining and should not be limited to the tools required by a particular organization or business. In fact, it is the organization's responsibility to train the data mining professional from a broader perspective. Data mining is never complete without the analyst. How well the results of data mining can be applied to a specific business depends significantly on how well the analyst understands the industry-specific objectives.
Applications of Data Mining
Data mining is used in various applications such as forecasting business and customer trends, detecting fraud (especially in the banking sector), generating customized advertisements, grouping customers on the basis of their purchasing trends, and risk analysis.
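As a rough illustration of the fraud detection case, the sketch below flags transaction amounts that lie far from the average. It uses a simple z-score rule chosen for brevity, not the algorithm of any particular product. The amounts, the `flag_outliers` name, and the threshold of 2.0 are assumptions made for the example.

```python
from statistics import mean, pstdev

def flag_outliers(amounts, threshold=2.0):
    """Return amounts whose z-score exceeds the threshold (hypothetical rule)."""
    mu = mean(amounts)
    sigma = pstdev(amounts)
    if sigma == 0:
        # All amounts are identical, so nothing stands out.
        return []
    return [a for a in amounts if abs(a - mu) / sigma > threshold]

# Made-up card transactions; one is far larger than the rest.
amounts = [20, 25, 22, 30, 27, 24, 500]
print(flag_outliers(amounts))
```

A flagged amount is only a candidate for review. Deciding whether it is actually fraud still requires the analyst's knowledge of the business.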
Benefits of Microsoft Data Mining
Microsoft data mining is extensible. It is licensed through SQL Server 2008 (or SQL Server 2005), and it is compatible with other technologies, which allows access to data in different formats. Microsoft data mining can also be used in business intelligence solutions, and it is scalable, unlike other data mining products.
Reference: Pinal Dave (http://blog.sqlauthority.com)