Since the physical storage of data impacts the speed and efficiency of our queries, in tomorrow’s post we will explore how clustered indexes can impact the physical location of data and the way SQL Server retrieves query data. For today we will need to know the basics of the Clustered Index.
What is clustering or a clustered index? Let’s take a real life example. The store names in a shopping mall are not lined up in alphabetical order. In other words, Zales could be right next to Benetton. These two stores might occupy Units #410 and #411. Thus, the stores in a mall are clustered by unit number. The clustered index represents the actual physical order of your data. A paperback book (like Joes 2 Pros) is ordered by page numbers. If you locate page 99, then you know exactly where page 100 is. Thus, you could say the pages in a book represent a clustered index and they are clustered by page number.
Think of a memory page as a carton of eggs and think of each egg as a row of data. If you only had two eggs to store, then you would just need one carton. With just two eggs you have 10 available slots for when you insert more eggs. Eventually, you could acquire more eggs than could fit into one carton. What would you do then? You would fill up your first carton and get another one. Once you get to 13 eggs you need 2 cartons to hold them. Similarly, only so many records will fit into one memory page. If more records come in, then SQL Server will use more memory pages for that table.
While a table can have a maximum of just 1024 columns, or fields, it can have trillions of records. What makes a table take up space is the number of records in that table. If a table had no records, then it would take up very little space.
A page can hold up to 8K of data. If each record takes up 2K of data, then 4 records would fit in each datapage. If such a table had 12 records, then it would use 3 memory pages. In this pseudo example, 12 records from this table would fill up three memory pages.
In Volume 2 (SQL Queries Joes 2 Pros), you learned that with the use of an ORDER BY clause you can view a table’s records in any order you like. This is true regardless of the order in which the table has its rows stored. If you don’t use an ORDER BY clause, then you get the table’s natural sort order. What is a natural sort order? It is the sequence, from beginning to end, in which each row is stored in the table. You can tell a table how it should store its data or let the table just store data in order as it is entered.
Absent a clustered index or constraints (e.g., a primary key, a foreign key, etc.), the default order of records is the order in which they were entered into the table. The clustered index represents the actual physical order of your data but it does not guarantee order when retrieved.
If we decide to add a clustered index to this table based on the SSN field, then regardless of the order in which you insert these records, each record will be stored in order of SSN. If you insert SSN 888-88-8888 first and then later add 222-22-2222, SQL Server would physically reorder the records in storage so they line up by SSN.
NOTE: Please note that this is Level 100 explanation and how the rows are ordered and B Trees are organized are out of the scope of this article.
NOTE1: This article is just a introduction and not a deep dive into the storage structure. Here is the MSDN page for complete reference.
A heap has:
Please leave your answer in comment section below with correct option, explanation and your country of resident.
Every day one winner will be announced from United States.
Every day one winner will be announced from India.
A valid answer must contain country of residence of answerer.
Please check my facebook page for winners name and correct answer.
Winner from United States will get Joes 2 Pros Volume 3.
Winner from India will get Joes 2 Pros Volume 3.
The contest is open till next blog post shows up at http://blog.sqlauthority.com which is next day GTM+2.5.
Reference: Pinal Dave (http://blog.SQLAuthority.com)