Today, I would like to share a quick tip on how to remove a bookmark lookup or RID lookup from a query. Let us first understand what these lookups are. Please note that from SQL Server 2005 SP2 onwards, a bookmark lookup on a table with a clustered index is shown as Key Lookup in the execution plan.
When a query requests a small number of rows, the SQL Server optimizer will try to use a non-clustered index on the column or columns contained in the WHERE clause to retrieve the data. If the query requests data from columns not present in the non-clustered index, SQL Server must go back to the data pages to get the data in those columns. Whether or not the table has a clustered index, the query still has to return to the table or clustered index to retrieve the data.
In the above scenario, if the table has a clustered index, the operation is called a bookmark lookup (or key lookup); if the table does not have a clustered index but has a non-clustered index, it is called an RID lookup. This operation can be very expensive, because the lookup is repeated for every row returned by the non-clustered index. To optimize a query containing a bookmark lookup or RID lookup, the lookup should be removed from the execution plan. There are two different ways to do this.
Before we look at these two methods, we will create a sample table without a clustered index, that is, a heap. An RID lookup is a bookmark lookup on a heap that uses a supplied row identifier (RID).
USE tempdb
GO
-- Create Table OneIndex with few columns
CREATE TABLE OneIndex (ID INT,
FirstName VARCHAR(100),
LastName VARCHAR(100),
City VARCHAR(100))
GO
-- Insert One Hundred Thousand Records
INSERT INTO OneIndex (ID,FirstName,LastName,City)
SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.name) RowID,
'Bob',
CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith'
ELSE 'Brown' END,
CASE
WHEN ROW_NUMBER() OVER (ORDER BY a.name)%1000 = 1 THEN 'Las Vegas'
WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York'
WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino'
WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles'
ELSE 'Houston' END
FROM sys.all_objects a
CROSS JOIN sys.all_objects b
GO
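Before checking any execution plan, it can help to confirm how selective the City column is. The following is a minimal sketch, assuming all 100,000 rows were inserted by the script above; the column alias RowsPerCity is only an illustrative name. Given the CASE expression, 'Las Vegas' should account for 100 rows and 'Houston' for 70,000, so a predicate on 'Las Vegas' is a good example of a query that requests a small number of rows.

```sql
-- Count rows per city to see how selective each value is
SELECT City, COUNT(*) AS RowsPerCity
FROM OneIndex
GROUP BY City
ORDER BY RowsPerCity
GO
```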
Now let us run the following SELECT statement and check the execution plan.
SELECT ID, FirstName
FROM OneIndex
WHERE City = 'Las Vegas'
GO
As there is no index on the table, a table scan is performed. We will create a clustered index on the table and check the execution plan once again.
-- Create Clustered Index
CREATE CLUSTERED INDEX [IX_OneIndex_ID] ON [dbo].[OneIndex] (
[ID] ASC
) ON [PRIMARY]
GO
Now, run the following SELECT statement on the table once again.
SELECT ID, FirstName
FROM OneIndex
WHERE City = 'Las Vegas'
GO
It is clear from the execution plan that, now that a clustered index exists on the table, the table scan has become a clustered index scan. In either case, the base table is scanned completely and there is no seek on the table.
Now, let us look at the WHERE clause of our query. As a basic rule, if we create an index on the column used in the WHERE clause, a performance improvement may be obtained. Let us create a non-clustered index on the table and then check the execution plan.
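Besides the graphical plan, the cost of a scan can also be observed through I/O statistics. The sketch below assumes the script is run in SQL Server Management Studio, where the output appears on the Messages tab; the number of logical reads depends on your server, so no figures are shown here.

```sql
-- Report logical reads for the statements that follow
SET STATISTICS IO ON
GO
SELECT ID, FirstName
FROM OneIndex
WHERE City = 'Las Vegas'
GO
-- Switch the reporting off again
SET STATISTICS IO OFF
GO
```

Repeating the same three statements after each index is created gives a simple way to compare the approaches by logical reads rather than by plan shape alone.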
-- Create Index on Column City As that is used in where condition
CREATE NONCLUSTERED INDEX [IX_OneIndex_City] ON [dbo].[OneIndex] (
[City] ASC
) ON [PRIMARY]
GO
After creating the non-clustered index, let us run our select statement again and check the execution plan.
SELECT ID, FirstName
FROM OneIndex
WHERE City = 'Las Vegas'
GO
As we have an index on the column used in the WHERE clause, the SQL Server query execution engine uses the non-clustered index to retrieve data from the table. However, not all the columns used in the SELECT clause are part of the index (ID is available because it is the clustered index key, but FirstName is not), and to return them the engine has to go to the base table again. This particular behavior is known as bookmark lookup or key lookup.
There are two different methods to resolve this issue. I have demonstrated both methods together; however, you only need one of them to remove the key lookup. I prefer Method 2.
Method 1: Creating a non-clustered covering index.
In this method, we will create a non-clustered index whose key contains the columns used in the SELECT statement along with the column used in the WHERE clause.
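The demo table has a clustered index, so the plan shows a key lookup. To see the RID lookup variant, the same experiment can be repeated on a heap. The following is a sketch under stated assumptions: the table name OneIndexHeap and the index name IX_OneIndexHeap_City are hypothetical names introduced only for this illustration, and the index hint is there to make sure the non-clustered index is used regardless of what the optimizer would choose on its own.

```sql
-- Hypothetical heap copy of the demo table; SELECT ... INTO creates it without any index
SELECT ID, FirstName, LastName, City
INTO dbo.OneIndexHeap
FROM dbo.OneIndex
GO
-- Non-clustered index on the heap (index name chosen for this sketch)
CREATE NONCLUSTERED INDEX IX_OneIndexHeap_City ON dbo.OneIndexHeap (City)
GO
-- The hint forces the non-clustered index so that the lookup appears in the plan
SELECT ID, FirstName
FROM dbo.OneIndexHeap WITH (INDEX(IX_OneIndexHeap_City))
WHERE City = 'Las Vegas'
GO
-- Remove the hypothetical table again
DROP TABLE dbo.OneIndexHeap
GO
```

In the resulting plan, the lookup operator should appear as RID Lookup (Heap) rather than Key Lookup (Clustered). On a heap, the non-clustered index does not carry the ID column, so both ID and FirstName come from the lookup.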
CREATE NONCLUSTERED INDEX [IX_OneIndex_Cover] ON [dbo].[OneIndex] (
City, FirstName, ID
) ON [PRIMARY]
GO
Once the above non-clustered index, which covers all the columns in the query, is created, let us run the following SELECT statement and check our execution plan.
SELECT ID, FirstName
FROM OneIndex
WHERE City = 'Las Vegas'
GO
From the execution plan, we can confirm that the key lookup is removed and only an index seek is happening. As there is no key lookup, the SQL Server query engine does not have to go back to the data pages; it obtains all the necessary data from the index itself.
Method 2: Creating an included column non-clustered index.
Here, we will create a non-clustered index on the column used in the WHERE clause that also includes the columns used in the SELECT statement. In this method, we will use the INCLUDE syntax introduced in SQL Server 2005. An index with included nonkey columns can significantly improve query performance when all columns in the query are present in the index, either as key or as nonkey columns.
CREATE NONCLUSTERED INDEX [IX_OneIndex_Include] ON [dbo].[OneIndex] (
City
) INCLUDE (FirstName,ID) ON [PRIMARY]
GO
Run the same SELECT statement once more. From the execution plan, it is clear that this method removes the key lookup as well.
In summary, a key lookup, bookmark lookup or RID lookup reduces the performance of a query, and we can improve the performance by using an included column index or a covering index.
I will cover a few additional concepts related to the optimal method in another article.
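Since all three non-clustered indexes now exist on the table, the optimizer uses only one of them for the query. To compare them side by side, each index can be forced with a table hint. This is a sketch for exploration only, using the index names created above; index hints are generally not something to leave in production queries.

```sql
-- Index on City only: expect an index seek plus a key lookup for FirstName
SELECT ID, FirstName
FROM OneIndex WITH (INDEX(IX_OneIndex_City))
WHERE City = 'Las Vegas'
GO
-- Method 1, covering index: expect an index seek without a lookup
SELECT ID, FirstName
FROM OneIndex WITH (INDEX(IX_OneIndex_Cover))
WHERE City = 'Las Vegas'
GO
-- Method 2, included column index: expect an index seek without a lookup
SELECT ID, FirstName
FROM OneIndex WITH (INDEX(IX_OneIndex_Include))
WHERE City = 'Las Vegas'
GO
```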
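As mentioned earlier, one of the two methods is enough. A possible cleanup, assuming you settle on the included column index (Method 2), could look like the sketch below. Dropping IX_OneIndex_City as well is my assumption: its only key column, City, is also the leading key of the remaining index, so for this demo query it is redundant. Verify against your own workload before dropping anything.

```sql
-- Keep IX_OneIndex_Include (Method 2) and drop the covering index from Method 1
DROP INDEX [IX_OneIndex_Cover] ON [dbo].[OneIndex]
GO
-- Optional: the single-column index on City is redundant for this demo query
DROP INDEX [IX_OneIndex_City] ON [dbo].[OneIndex]
GO
```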
Reference : Pinal Dave (https://blog.sqlauthority.com)
32 Comments.
Thanks Pinal for your very clear and helpful article!
Thanks. This was quite informative. I tried to apply it on one of my tables. It's a partitioned table and the WHERE clause has to go from one side of the partition to the other side to fetch all the data required.
My query becomes something like this
select a.*
into ##TEST_RET
from RETURNS_TABLE a
join ##TEST_ID b on a.ID = b.ID and a.TODATE between b.STARTDATE and b.ENDDATE;
This RETURNS_TABLE has partitions. I tried both covered index and include index options, but none of them seem to help. Do you have any suggestions for the partitioned tables?
Thanks again!
Thanks. This was informative.
Nicely explained. Thanks.
My pleasure. Glad you liked it @Subhrajit
For Abhijat
You can create Unique clustered index(UID column,Partition column) with Nonclustered COLUMNSTORE index(table columns).
I’m using it in my DWH
Agree. Thanks Li-ad.
Thank you for such a simple and nice explanation.