SQL SERVER – Query Optimization – Remove Bookmark Lookup – Remove RID Lookup – Remove Key Lookup

Today, I would like to share one very quick tip about how to remove bookmark lookup or RID lookup. Let us first understand Bookmark lookup or RID lookup. Please note that from SQL Server 2005 SP1 onwards, Bookmark look up is known as Key look up.

When a small number of rows are requested by a query, the SQL Server optimizer will try to use a non-clustered index on the column or columns contained in the WHERE clause to retrieve the data requested by the query. If the query requests data from columns not present in the non-clustered index, SQL Server must go back to the data pages to get the data in those columns. Even if the table has a clustered index or not, the query will still have to return to the table or clustered index to retrieve the data.

In the above scenario, if table has clustered index, it is called bookmark lookup (or key lookup); if the table does not have clustered index, but a non-clustered index, it is called RID lookup. This operation is very expensive. To optimize any query containing bookmark lookup or RID lookup, it should be removed from the execution plan to improve performance. There are two different ways to remove bookmark/RID lookup.

Before we understand these two methods, we will create sample table without clustered index and simulate RID lookup. RID Lookup is a bookmark lookup on a heap that uses a supplied row identifier (RID).

USE tempdb
GO
-- Create Table OneIndex with few columns
CREATE TABLE OneIndex (ID INT,
FirstName VARCHAR(100),
LastName VARCHAR(100),
City VARCHAR(100))
GO
-- Insert One Hundred Thousand Records
INSERT INTO OneIndex (ID,FirstName,LastName,City)
SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.name) RowID,
'Bob',
CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith'
ELSE 'Brown' END,
CASE
WHEN ROW_NUMBER() OVER (ORDER BY a.name)%1000 = 1 THEN 'Las Vegas'
WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York'
WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino'
WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles'
ELSE 'Houston' END
FROM
sys.all_objects a
CROSS JOIN sys.all_objects b
GO

Now let us run following select statement and check the execution plan.

SELECT ID, FirstName
FROM OneIndex
WHERE City = 'Las Vegas'
GO

SQL SERVER - Query Optimization - Remove Bookmark Lookup - Remove RID Lookup - Remove Key Lookup keylook1

As there is no index on table, scan is performed over the table. We will create a clustered index on the table and check the execution plan once again.

-- Create Clustered Index
CREATE CLUSTERED INDEX [IX_OneIndex_ID] ON [dbo].[OneIndex] (
[ID] ASC
) ON [PRIMARY] GO

Now, run following select on the table once again.

SELECT ID, FirstName
FROM OneIndex
WHERE City = 'Las Vegas'
GO

SQL SERVER - Query Optimization - Remove Bookmark Lookup - Remove RID Lookup - Remove Key Lookup keylook2

It is clear from execution plan that as a clustered index is created on the table, table scan is now converted to clustered index scan. In either case, base table is completely scanned and there is no seek on the table.

Now, let us see the WHERE clause of our table. From our basic observation, if we create an index on the column that contains the clause, a performance improvement may be obtained. Let us create non-clustered index on the table and then check the execution plan.

-- Create Index on Column City As that is used in where condition
CREATE NONCLUSTERED INDEX [IX_OneIndex_City] ON [dbo].[OneIndex] (
[City] ASC
) ON [PRIMARY] GO

After creating the non-clustered index, let us run our select statement again and check the execution plan.

SELECT ID, FirstName
FROM OneIndex
WHERE City = 'Las Vegas'
GO

SQL SERVER - Query Optimization - Remove Bookmark Lookup - Remove RID Lookup - Remove Key Lookup keylook3

As we have an index on the WHERE clause, the SQL Server query execution engine uses the non-clustered index to retrieve data from the table. However, the columns used in the SELECT clause are still not part of the index, and to display those columns, the engine will have to go to the base table again and retrieve those columns. This particular behavior is known as bookmark lookup or key lookup.

There are two different methods to resolve this issue. I have demonstrated both the methods together; however, it is recommended that you use any one of these methods for removing key lookup. I prefer Method 2.

Method 1: Creating non-clustered cover index.

In this method, we will create non-clustered index containing the columns, which are used in the SELECT statement, along with the column which is used in the WHERE clause.

CREATE NONCLUSTERED INDEX [IX_OneIndex_Cover] ON [dbo].[OneIndex] (
City, FirstName, ID
) ON [PRIMARY] GO

Once the above non-clustered index, which covers all the columns in query, is created, let us run the following SELECT statement and check our execution plan.

SELECT ID, FirstName
FROM OneIndex
WHERE City = 'Las Vegas'
GO

From the execution plan, we can confirm that key lookup is removed the only index seek is happening. As there is no key lookup, the SQL Server query engine does not have to go to retrieve the data from data pages and it obtains all the necessary data from index itself.

SQL SERVER - Query Optimization - Remove Bookmark Lookup - Remove RID Lookup - Remove Key Lookup keylook4

Method 2: Creating an included column non-clustered index.

Here, we will create non-clustered index that also includes the columns, which are used in the SELECT statement, along with the column used in the WHERE clause. In this method, we will use new syntax introduced in SQL Server 2005. An index with included nonkey columns can significantly improve query performance when all columns in the query are included in the index.

CREATE NONCLUSTERED INDEX [IX_OneIndex_Include] ON [dbo].[OneIndex] (
City
) INCLUDE (FirstName,ID) ON [PRIMARY] GO

From the execution plan, it is very clear that this method  also removes the key lookup as well.

SQL SERVER - Query Optimization - Remove Bookmark Lookup - Remove RID Lookup - Remove Key Lookup keylook5

In summary, Key lookup, Bookmark lookup or RID lookup reduces the performance of query, and we can improve the performance of query by using included column index or cover index.

I will cover few additional concepts related to the optimal method in another article.

Reference : Pinal Dave (https://blog.sqlauthority.com)

Related Post:

SQL SERVER – Query Optimization – Remove Bookmark Lookup – Remove RID Lookup – Remove Key Lookup

SQL SERVER – Query Optimization – Remove Bookmark Lookup – Remove RID Lookup – Remove Key Lookup – Part 2

SQL SERVER – Query Optimization – Remove Bookmark Lookup – Remove RID Lookup – Remove Key Lookup – Part 3

SELECT ID, FirstName
FROM OneIndex
WHERE City = 'Las Vegas'
GO

SQL Index, SQL Scripts
Previous Post
SQL SERVER – Interesting Observation – Query Hint – FORCE ORDER
Next Post
SQL SERVER – Query Optimization – Remove Bookmark Lookup – Remove RID Lookup – Remove Key Lookup – Part 2

Related Posts

32 Comments. Leave new

  • No need to add ID to your nonclustered index… that’s redundant. Since you have a clustered index on ID, then ID will automatically be included in any nonclustered index you create (so that it could do the bookmark lookup).

    So your nonclustered index examples would be to either create an index on (City, FirstName) or an index on (City) INCLUDE (FirstName).

    –Brad

    Reply
  • I have a table customer in which cstid of BIGINT data type with identity. As table have identity column I do not want any type of index on it. Being cstid as primary key of table, is it possible without any index?

    Reply
  • Dear Pinal,

    good post, very clear sample, but I would like to see a remark, that Key-Lookups could be in some situations, special with very wide tables, and the SELECT includes a lot or all of the columns, the much much better way. Think about a table containing 11.000.000 customers with 200 columns an the avg size of a row of 30kb. The SELECT need to return the whole row-data. Seach is done within the lastname, no leading placeholder to be able to use an Index. So an additional small index only containing the lastname and the key and then doing an Key Lookup is much much faster instead of using the index, even clustered index where every row nees severals data blocks. In case of IO we see differences in the factor of 500 times more IO.

    Regards,
    Michael

    Reply
    • Hi pinal,

      We have a requirement to Merge six 12millions data tables into single Table.All the tables are having duplicate Emails.So We put the Primary key for email column using EnterPrise Manager and We r trying to merge by writing the following insert query, But it’s taking more than 20hrs…:-(

      Insert into Total( Email,FirstName,LastName,Address,City,State,Zipcode,Gender,DOB ,
      Phone,WebAddress1,Webaddress2,IPAddress,DateTime,Interest )
      Select Email,FirstName,LastName,Address,City,State,Zipcode,Gender,DOB ,
      Phone,WebAddress1,Webaddress2,IPAddress,DateTime,Interest
      From [Set4]
      Where
      (not exists(select email from Total
      where Total.email=[Set4].email))

      How can we do this most efficiently??? Please Help Us……

      Thanks in advance…

      Reply
  • Hi Pinal,
    Nice explanation of the covering index concept. What is your take on the comment posted by Michael about the customer table situation, where one needs to return a lot of columns.

    Thank you

    Reply
  • Pinal,

    Why not just remove, from the query, the column that’s causing the key lookup. This is really much simpler I’m sure you’ll agree. You’ve written about best practices for database design in the past, and taught me that fewer columns per table are always better. So if you’re getting lots of key lookups, it follows that we should split up our tables and make them smaller. This is the query optimizer’s way of telling us that we have messed up our design!

    Thank you for continuing to educate us on how best to use our database.

    Reply
  • But what if you don’t need those columns. Then you can remove them. This example proves that you shouldn’t add columns to a query that you don’t need. That’s never a best practice (at least in my experience). You revealed in another post that we should keep indexes as narrow as possible, and I have been following this rule religiously! Since starting down this path I have managed to get almost every index in our database down to a single column. So far this has worked very nicely as far as I can tell, but I’m not really sure how to collect query performance numbers. Would you mind helping me so that I can send you a report and you can show readers what great progress can be made if we follow your methods.

    Reply
  • Hi Pinal,

    I’m curious to know why you have included the ID column in the covering index, as well as in the nc index with the includes. I thought all columns in the Clustered index were already included in the non clustered index.

    Is there a reason why you did that?

    Thanks,

    Reply
  • And these new notes were not just plain bank notes. ,

    Reply
  • Awesome. Managed to knock down a query from 5-6 minutes down to 5 seconds or so.

    Reply
  • Excellent Post!!! Thank you for the information. This helps me improve my SP from 5 seconds to 0.xx second. Thanks alot!

    Reply
  • Sridhar Nalluri
    June 15, 2010 2:13 pm

    Hi Pinal,

    Nice post, very impressive and understandable…Thank You.

    Reply
  • Nick Duckstein
    November 2, 2010 5:23 am

    Dave,

    I’m confused why the best recommendation isn’t to create a clustered index. In our situation we have a 5 Billion row table without a clustered index. The RID lookups are killing us. The need for a lot of extra non-clustered indexes is also killing performance because this is on a replication server.

    Finally the writes to a clustered index are dramatically less than to a non-clustered index with a HEAP.

    See: https://docs.microsoft.com/en-us/previous-versions/sql/sql-server-2005/administrator/cc917672(v=technet.10)#EFAA

    How can anyone justify a large table without a clustered index?

    Nick

    Reply
  • HI Pinal,

    Thanks for this post.Its very easy to understand.Good post.

    Reply
  • Hi,
    If you are saying, if table has clustered index, it is called bookmark lookup (or key lookup); if the table does not have clustered index, but a non-clustered index, it is called RID lookup then you again wrote that as we have an index on the WHERE clause, the SQL Server query execution engine uses the non-clustered index to retrieve data from the table. However, the columns used in the SELECT clause (id, firstname) are still not part of the index that means (neither cluster index nor non-cluster index has applied on it)and to display those columns, the engine will have to go to the base table again and retrieve those columns. so how we can say this particular behavior is known as bookmark lookup or key lookup

    Reply
  • Girijesh Pandey
    October 2, 2012 1:30 pm

    Hi Pinal,

    Nice explanation, too easy to understand.

    i have one question over here, let me write your code what you have used.

    SELECT ID, FirstName
    FROM OneIndex
    WHERE City = ‘Las Vegas’

    CREATE NONCLUSTERED INDEX [IX_OneIndex_Include] ON [dbo].[OneIndex]
    (
    City
    ) INCLUDE (FirstName,ID) ON [PRIMARY]
    GO

    As i can see in above code, you are selecting ID then FirstName columns in Select list, but while including these columns for Non Clustered Index you have not followed the same sequence/order (Included Firstname then ID)

    Is this really fruitful to improve query performance?

    As per my understanding, we should follow the same sequence as we are selecting the fields, same sequence should be followed while including in Index, It would be easy for SQL Server to get the data in fast way.

    Regards,
    Girijesh

    Reply
  • Hi,
    great article, i find it very helpful…
    I have a question… what happens when the output list of a key lookup is NULL? what is it looking for?

    Reply
  • Hello Pinal,

    Very clear and informative. Understanding tuning is tricky but this post made it seem simple.

    My question is, when creating the composite index, why do you specify city first, rather than ID, which is more unique.

    Thanks,
    Steven

    Reply
  • Thanks Pinal for your very clear and helpful article!

    Reply
  • Thanks. This was quite informative. I tried to apply it on one of my tables. Its a partitioned table and the where clause has to go from one side of the partition to the other side to fetch all the data required.
    My query becomes something like this

    select a.*
    into ##TEST_RET
    from RETURNS_TABLE a
    join ##TEST_ID b on a.ID = b.ID and a.TODATE between b.STARTDATE and b.ENDDATE;
    This RETURNS_TABLE has partitions. I tried both covered index and include index options, but none of them seem to help. Do you have any suggestions for the partitioned tables?

    Thanks again!

    Reply
  • Thanks. This was informative.

    Reply

Leave a Reply