Today, I would like to share one very quick tip about how to remove bookmark lookup or RID lookup. Let us first understand Bookmark lookup or RID lookup. Please note that from SQL Server 2005 SP1 onwards, Bookmark look up is known as Key look up.
When a small number of rows are requested by a query, the SQL Server optimizer will try to use a non-clustered index on the column or columns contained in the WHERE clause to retrieve the data requested by the query. If the query requests data from columns not present in the non-clustered index, SQL Server must go back to the data pages to get the data in those columns. Even if the table has a clustered index or not, the query will still have to return to the table or clustered index to retrieve the data.
In the above scenario, if table has clustered index, it is called bookmark lookup (or key lookup); if the table does not have clustered index, but a non-clustered index, it is called RID lookup. This operation is very expensive. To optimize any query containing bookmark lookup or RID lookup, it should be removed from the execution plan to improve performance. There are two different ways to remove bookmark/RID lookup.
Before we understand these two methods, we will create sample table without clustered index and simulate RID lookup. RID Lookup is a bookmark lookup on a heap that uses a supplied row identifier (RID).
USE tempdb
GO-- Create Table OneIndex with few columnsCREATE TABLE OneIndex (ID INT,FirstName VARCHAR(100),LastName VARCHAR(100),City VARCHAR(100))GO-- Insert One Hundred Thousand RecordsINSERT INTO OneIndex (ID,FirstName,LastName,City)SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.name) RowID,'Bob',CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith'ELSE 'Brown' END,CASEWHEN ROW_NUMBER() OVER (ORDER BY a.name)%1000 = 1 THEN 'Las Vegas'WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 1 THEN 'New York'WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 5 THEN 'San Marino'WHEN ROW_NUMBER() OVER (ORDER BY a.name)%10 = 3 THEN 'Los Angeles'ELSE 'Houston' END
FROM sys.all_objects aCROSS JOIN sys.all_objects b
GO
Now let us run following select statement and check the execution plan.
SELECT ID, FirstNameFROM OneIndexWHERE City = 'Las Vegas'GO
As there is no index on table, scan is performed over the table. We will create a clustered index on the table and check the execution plan once again.
-- Create Clustered IndexCREATE CLUSTERED INDEX [IX_OneIndex_ID] ON [dbo].[OneIndex]([ID] ASC) ON [PRIMARY]
GO
Now, run following select on the table once again.
SELECT ID, FirstNameFROM OneIndexWHERE City = 'Las Vegas'GO
It is clear from execution plan that as a clustered index is created on the table, table scan is now converted to clustered index scan. In either case, base table is completely scanned and there is no seek on the table.
Now, let us see the WHERE clause of our table. From our basic observation, if we create an index on the column that contains the clause, a performance improvement may be obtained. Let us create non-clustered index on the table and then check the execution plan.
-- Create Index on Column City As that is used in where conditionCREATE NONCLUSTERED INDEX [IX_OneIndex_City] ON [dbo].[OneIndex]([City] ASC) ON [PRIMARY]
GO
After creating the non-clustered index, let us run our select statement again and check the execution plan.
SELECT ID, FirstNameFROM OneIndexWHERE City = 'Las Vegas'GO
As we have an index on the WHERE clause, the SQL Server query execution engine uses the non-clustered index to retrieve data from the table. However, the columns used in the SELECT clause are still not part of the index, and to display those columns, the engine will have to go to the base table again and retrieve those columns. This particular behavior is known as bookmark lookup or key lookup.
There are two different methods to resolve this issue. I have demonstrated both the methods together; however, it is recommended that you use any one of these methods for removing key lookup. I prefer Method 2.
Method 1: Creating non-clustered cover index.
In this method, we will create non-clustered index containing the columns, which are used in the SELECT statement, along with the column which is used in the WHERE clause.
CREATE NONCLUSTERED INDEX [IX_OneIndex_Cover] ON [dbo].[OneIndex](City, FirstName, ID) ON [PRIMARY]
GO
Once the above non-clustered index, which covers all the columns in query, is created, let us run the following SELECT statement and check our execution plan.
SELECT ID, FirstNameFROM OneIndexWHERE City = 'Las Vegas'GO
From the execution plan, we can confirm that key lookup is removed the only index seek is happening. As there is no key lookup, the SQL Server query engine does not have to go to retrieve the data from data pages and it obtains all the necessary data from index itself.
Method 2: Creating an included column non-clustered index.
Here, we will create non-clustered index that also includes the columns, which are used in the SELECT statement, along with the column used in the WHERE clause. In this method, we will use new syntax introduced in SQL Server 2005. An index with included nonkey columns can significantly improve query performance when all columns in the query are included in the index.
CREATE NONCLUSTERED INDEX [IX_OneIndex_Include] ON [dbo].[OneIndex](City) INCLUDE (FirstName,ID) ON [PRIMARY]
GO
From the execution plan, it is very clear that this method also removes the key lookup as well.
In summary, Key lookup, Bookmark lookup or RID lookup reduces the performance of query, and we can improve the performance of query by using included column index or cover index.
I will cover few additional concepts related to the optimal method in another article.