Business Intelligence (BI) plays a significant role in businesses nowadays. Moreover, the databases that deal with the queries related to BI are presently facing an increase in workload. At present, when queries are sent to very large databases, millions of rows are returned. Also the users have to go through extended query response times when joining multiple tables are involved with such queries. ‘Star Join Query Optimization’ is a new feature of SQL Server 2008 Enterprise Edition. This mechanism uses bitmap filtering for improving the performance of some types of queries by the effective retrieval of rows from fact tables.
Improved Query Response Times
In general, data warehouses employ dimensionally modeled star or snowflake schemas. These schemas have one or more than one fact tables that contain transactional data and many dimension tables, which holds information such as product data, customer information, and times and dates – all these define the fact table data. Usually, foreign keys are employed for maintaining relationships between the rows in fact tables and also between the rows in the dimension tables. Databases that contain star schemas are recognized by SQL Server 2008 Enterprise. It uses the new Star Join Query logic for processing queries against such star schemas more efficiently. Typically, on an average, data warehouse queries run faster to approximately 20 percent.
Star Join Query Optimization is automatically implemented by the SQL Server. It does not require a special database or application configuration. The query processor will usually optimize queries with medium selectivity (this refers to the queries that retrieve approximately 10% to 75% of rows from a fact table). Such queries are usually handled using hash joins to join the dimension and fact tables by employing the foreign keys to identify the matching rows. A hash table is built for each dimension table referenced in the query in the case of hash joins; the optimization process uses these hash tables for deriving bitmap filters. The key values from each dimension table are identified by bitmap filters; these key values qualify for inclusion in the query. When the fact table is scanned, the bitmap filters are applied to it. These bitmap filters eliminate those rows of the fact table which are not qualified for inclusion in the result set. The most selective bitmap filter is applied first as it is found to eliminate the highest number of rows. Since the eliminated rows do not need further processing, the subsequent filters need not be applied to them – this way the process becomes more efficient.
The performance is enhanced in the case of medium selectivity queries while using bitmap filtering because the rows are filtered before any joins are implemented. Hence, there is a decrease in the number of rows that are processed by each join. Bitmap filtering is not applied when queries are highly selective (i.e., those queries that return less than 10% of the rows in a fact table). In such case, a nested loop join is found to be generally more efficient. Similarly, when the queries are not very selective at all (queries which return more than 75% of the rows in a fact table), bitmap filtering is not applied as there are very few rows to be filtered, and hence, there is no requirement of enhancement in performance in this case.
Integer Data Types
Star join optimization is found to give the highest efficiency when the data type of the columns used in the joins is integer. This feature enables the bitmap filter to be applied as part of the initial table or index scan rather than being used at a later stage in the query plan. Most of the queries are benefited from star join optimization since foreign key relationships are commonly implemented using integer-based alternate key columns.
Reference : Pinal Dave (http://blog.SQLAuthority.com)