SQL SERVER – 2008 Star Join Query Optimization

Business Intelligence (BI) plays a significant role in businesses nowadays. Moreover, the databases that deal with the queries related to BI are presently facing an increase in workload. At present, when queries are sent to very large databases, millions of rows are returned. Also the users have to go through extended query response times when joining multiple tables are involved with such queries. ‘Star Join Query Optimization’ is a new feature of SQL Server 2008 Enterprise Edition. This mechanism uses bitmap filtering for improving the performance of some types of queries by the effective retrieval of rows from fact tables.

Improved Query Response Times
In general, data warehouses employ dimensionally modeled star or snowflake schemas. These schemas have one or more than one fact tables that contain transactional data and many dimension tables, which holds information such as product data, customer information, and times and dates – all these define the fact table data. Usually, foreign keys are employed for maintaining relationships between the rows in fact tables and also between the rows in the dimension tables. Databases that contain star schemas are recognized by SQL Server 2008 Enterprise. It uses the new Star Join Query logic for processing queries against such star schemas more efficiently. Typically, on an average, data warehouse queries run faster to approximately 20 percent.

Automatically Implemented
Star Join Query Optimization is automatically implemented by the SQL Server. It does not require a special database or application configuration. The query processor will usually optimize queries with medium selectivity (this refers to the queries that retrieve approximately 10% to 75% of rows from a fact table). Such queries are usually handled using hash joins to join the dimension and fact tables by employing the foreign keys to identify the matching rows. A hash table is built for each dimension table referenced in the query in the case of hash joins; the optimization process uses these hash tables for deriving bitmap filters. The key values from each dimension table are identified by bitmap filters; these key values qualify for inclusion in the query. When the fact table is scanned, the bitmap filters are applied to it. These bitmap filters eliminate those rows of the fact table which are not qualified for inclusion in the result set. The most selective bitmap filter is applied first as it is found to eliminate the highest number of rows. Since the eliminated rows do not need further processing, the subsequent filters need not be applied to them – this way the process becomes more efficient.

Query Selectivity
The performance is enhanced in the case of medium selectivity queries while using bitmap filtering because the rows are filtered before any joins are implemented. Hence, there is a decrease in the number of rows that are processed by each join. Bitmap filtering is not applied when queries are highly selective (i.e., those queries that return less than 10% of the rows in a fact table). In such case, a nested loop join is found to be generally more efficient. Similarly, when the queries are not very selective at all (queries which return more than 75% of the rows in a fact table), bitmap filtering is not applied as there are very few rows to be filtered, and hence, there is no requirement of enhancement in performance in this case.

Integer Data Types
Star join optimization is found to give the highest efficiency when the data type of the columns used in the joins is integer. This feature enables the bitmap filter to be applied as part of the initial table or index scan rather than being used at a later stage in the query plan. Most of the queries are benefited from star join optimization since foreign key relationships are commonly implemented using integer-based alternate key columns.

Reference : Pinal Dave (http://blog.SQLAuthority.com)

About these ads

5 thoughts on “SQL SERVER – 2008 Star Join Query Optimization

  1. Thank you for this informative article, I had not read about this before but this will make a strong case for using a star schema when it fits the DB utilization.

    Like

  2. Please optimize it
    ****************************
    SET statistics io ON;
    SET statistics TIME ON;

    SELECT itm.userreviewid AS itemid,
    itm.reviewitem AS title,
    itm.imgpath AS imgpath,
    ‘Barkspider’ AS [Source],
    reviews.numratings,
    reviews.avgrating,
    5 AS ratingmax,
    ” AS itempath,
    ” AS identifier,
    ‘barspider-user-review.png’ AS logo,
    Isnull(0, 0) AS fromprice,
    Isnull(0, 0) AS toprice,
    reviews.totrating,
    Newid() AS ‘NewID’,
    ‘a’ AS bssort
    FROM barkspiderblog.dbo.userreviews itm WITH (nolock)
    JOIN (SELECT COUNT(1) AS numratings,
    AVG(rating) AS avgrating,
    SUM(rating) AS totrating,
    reviewitem
    FROM barkspiderblog.dbo.userreviews WITH (nolock)
    GROUP BY reviewitem) reviews
    ON Lower(itm.reviewitem) = Lower(reviews.reviewitem)
    WHERE ( ( ( itm.reviewitemid IS NULL
    OR itm.reviewitemid = 0 )
    AND itm.parentid IS NOT NULL )
    OR ( itm.reviewitemid IS NOT NULL
    AND itm.reviewitemid = itm.userreviewid ) )
    AND itm.reviewitem LIKE ‘%tv%’
    AND itm.userreviewid IN (SELECT userreviewid
    FROM (SELECT Rank() OVER (PARTITION BY
    reviewitem
    ORDER
    BY createdon ASC) AS
    r1,
    userreviewid
    FROM barkspiderblog.dbo.userreviews
    WHERE reviewitem LIKE ‘%tv%’) tab
    WHERE tab.r1 = 1)
    UNION ALL
    SELECT itm.itemid,
    itm.title,
    itm.imgpath,
    src.DESCRIPTION AS [Source],
    reviews.numratings,
    reviews.avgrating,
    src.ratingmax,
    itm.itempath,
    itm.identifier,
    src.logo,
    fpr.fromprice,
    fpr.toprice,
    reviews.totrating,
    Newid() AS ‘NewID’,
    ‘b’ AS bssort
    FROM items itm WITH (nolock)
    JOIN sources src WITH (nolock)
    ON itm.sourceid = src.sourceid
    JOIN (SELECT COUNT(1) AS numratings,
    AVG(rating) AS avgrating,
    SUM(rating) AS totrating,
    itemid
    FROM reviews WITH (nolock)
    GROUP BY itemid) reviews
    ON itm.itemid = reviews.itemid
    LEFT OUTER JOIN fetchitemprice fpr WITH (nolock)
    ON itm.itemid = fpr.itemid
    WHERE title LIKE ‘%tv%’
    UNION ALL
    SELECT itm.itemid,
    itm.title,
    itm.imgpath,
    ‘Barkspider’ AS [Source],
    reviews.numratings,
    reviews.avgrating,
    5 AS ratingmax,
    itm.itempath,
    itm. identifier,
    ‘barspider-user-review.png’ AS logo,
    0 AS fromprice,
    0 AS toprice,
    reviews.totrating,
    Newid() AS newid,
    ‘a’ AS bssort
    FROM items itm WITH (nolock)
    JOIN sources src WITH (nolock)
    ON itm.sourceid = src.sourceid
    JOIN (SELECT COUNT(1) AS numratings,
    AVG(rating) AS avgrating,
    SUM(rating) AS totrating,
    reviewitemid,
    reviewitem
    FROM barkspiderblog.dbo.userreviews WITH (nolock)
    GROUP BY reviewitem,
    reviewitemid) reviews
    ON itm.itemid = reviews.reviewitemid
    LEFT OUTER JOIN fetchitemprice fpr WITH (nolock)
    ON itm.itemid = fpr.itemid
    WHERE itm.itemid IN (SELECT ur1.reviewitemid
    FROM barkspiderblog.dbo.userreviews ur1 WITH (nolock)
    WHERE ur1.parentid IS NULL)
    AND title LIKE ‘%tv%’
    ORDER BY bssort,
    Newid()

    Like

  3. Pingback: SQL SERVER – Weekly Series – Memory Lane – #006 « SQL Server Journey with SQL Authority

  4. Hi Pinal,
    Thanks for such a great article,

    I have a small doubt in this, as per the star join startagy it will create hash tables. If our fact table is very huge and our temp DB can’t take up this amount of data into hash table how does it manage the query.

    Like

  5. some quries take too much time to be executed at my distributed appliction database so I got technique is
    1.Make script of entire tables and sps
    2.Delete then re-create database
    3-Execute script throug dos command
    this works but after few weeks problem come again, I am stuck.What should I do?

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s