SQL SERVER – Tips from the SQL Joes 2 Pros Development Series – Sparse Data and Space Used by Sparse Data – Day 17 of 35

Answer simple quiz at the end of the blog post and –
Every day one winner from India will get Joes 2 Pros Volume 3.
Every day one winner from United States will get Joes 2 Pros Volume 3.

Sparse Data

Fields with fixed length data types (e.g., int, money) always consume their allotted space irrespective of how much data the field actually contains. This is true even if the field is populated with a null.

Occasionally you will encounter a column in your database which is rarely used. For example, suppose you have a field called [Violation] in a table called [Employee] but very few employees have any violations – perhaps two or three for every 1000 employees. In this case, over 99% of the Violation field values are null. This would be an example of a sparsely populated field.

To demonstrate a sparsely populated field, we will create a simple table with the design shown here. Create and populate the dbo.Bonus table by running this code in the figures below.

SQL SERVER - Tips from the SQL Joes 2 Pros Development Series - Sparse Data and Space Used by Sparse Data - Day 17 of 35 j2p_17_1

SQL SERVER - Tips from the SQL Joes 2 Pros Development Series - Sparse Data and Space Used by Sparse Data - Day 17 of 35 j2p_17_2

Please note that INSERT statements demonstrated will use row constructors, which debuted as a new feature in SQL Server 2008. However, readers who are running SQL Server 2005 must write their INSERT statements using the alternate syntax shown in the lower portion of the last Figure.

Now look at all the records in the table. Since all fields contain fixed length data types (i.e., int, char(2), money), we could have accurately calculated the per row consumption before we even added any data to the table.

SQL SERVER - Tips from the SQL Joes 2 Pros Development Series - Sparse Data and Space Used by Sparse Data - Day 17 of 35 j2p_17_3

Each row of the Bonus table will consume 21 bytes based on the data below.

  • Row Header – 4 bytes (true for all rows).
  • Fixed Data – 14 bytes (BonusID is an int (4 bytes) and BonusAmount’s data type is money (8 bytes).
  • Null Block (aka Null Bitmap) – 3 bytes

The nulls in the money field do not change the space consumption because fixed length data types always use the full amount of space allocated to them. At 21 bytes per row, 1000 rows of the Bonus table would require 21,000 bytes and fill up about 3 data pages (1 data page = 8060 bytes). Currently there are only three records in this table so just 1 data page of 8K has started to fill up.

Analyzing Space Used

A handy tool for checking the storage amount which an object occupies is sp_spaceused. The figure below shows the Bonus table passed into this stored procedure. We see the Bonus table contains three rows and its data has not yet exceeded its first 8 KB page.

SQL SERVER - Tips from the SQL Joes 2 Pros Development Series - Sparse Data and Space Used by Sparse Data - Day 17 of 35 j2p_17_4

If you added enough records the data space used would need more memory pages. Let’s run a loop to add 997 more records to the Bonus table. The first record populated will be row 4. This loop increments each subsequent BonusID value by 1, and the loop runs as long as the BonusID value is <=1000. Once the row containing BonusID 1000 has been entered into the table, the loop will end. One run the Bonus table contains 1000 records.

SQL SERVER - Tips from the SQL Joes 2 Pros Development Series - Sparse Data and Space Used by Sparse Data - Day 17 of 35 j2p_17_6

SQL SERVER - Tips from the SQL Joes 2 Pros Development Series - Sparse Data and Space Used by Sparse Data - Day 17 of 35 j2p_17_7

Since one row occupies 21 bytes, we know these 1000 rows will take up 21 KB of space and should fit within three data pages.

[21000 bytes/(8060 bytes/page) = 2.61 pages]

Since you can use half a data page SQL dedicated 3 data pages to the Bonus table. Let’s rerun the sp_spaceused sproc and confirm the number of data pages.

SQL SERVER - Tips from the SQL Joes 2 Pros Development Series - Sparse Data and Space Used by Sparse Data - Day 17 of 35 j2p_17_8

Using the Sparse Data Option

The sparse data option is a new SQL Server 2008 feature for fields you expect to be predominantly null. Using the sparse data option, you can instruct SQL Server to not have nulls consume space in sparsely populated fields.

To test this we’re going to delete the Bonus table and then re-create it using the same steps we took previously. The only difference will be that the BonusAmount field will be created using the sparse option. Recall we expect the BonusAmount field to contain very little actual data – most records will be null.

SQL SERVER - Tips from the SQL Joes 2 Pros Development Series - Sparse Data and Space Used by Sparse Data - Day 17 of 35 j2p_17_9

Recall we expect the BonusAmount field to contain very little actual data – most records will be null. The Bonus table now contains 1000 records and BonusAmount is a sparse field, but so far we don’t see any difference in the appearance of the table or the data.

SQL SERVER - Tips from the SQL Joes 2 Pros Development Series - Sparse Data and Space Used by Sparse Data - Day 17 of 35 j2p_17_10

Now let’s review how much space the Bonus table is consuming. Let’s confirm the space usage is less in this example. Had you not used the sparse option the consumption would have been 24KB like in our last example.

SQL SERVER - Tips from the SQL Joes 2 Pros Development Series - Sparse Data and Space Used by Sparse Data - Day 17 of 35 j2p_17_11

Note: If you want to setup the sample JProCo database on your system you can watch this video. For this post you will want to run the SQLArchChapter4.0Setup.sql script from Volume 3.

Question 17

You have two fields, of INT and MONEY, in your Bonus table. You have 1000 records and all instances of the money column are null. When you set up the money field, you used the Sparse option. How much space are the 1000 rows of the money field using?

  1. None
  2. 4000 bytes
  3. 8000 bytes

Do not forget to participate in special question over here: Pluralsight Giving Away Free Subscription to Quiz Participants

Rules:

Please leave your answer in comment section below with correct option, explanation and your country of resident.
Every day one winner will be announced from United States.
Every day one winner will be announced from India.
A valid answer must contain country of residence of answerer.
Please check my facebook page for winners name and correct answer.
Winner from United States will get Joes 2 Pros Volume 3.
Winner from India will get Joes 2 Pros Volume 3.
The contest is open till next blog post shows up at which is next day GTM+2.5.

Reference:  Pinal Dave (https://blog.sqlauthority.com)

Joes 2 Pros, SQL Scripts
Previous Post
SQL SERVER – Tips from the SQL Joes 2 Pros Development Series – System and Time Data Types – Day 16 of 35
Next Post
SQL SERVER – Geography Data Type – Calculating Distance Between Two Points on the Earth – Day 18 of 35

Related Posts

96 Comments. Leave new

  • Rene Alberto Castro Velasquez
    August 17, 2011 7:29 am

    Correct answer is No. 3: The field with MONEY datatype (8 bytes) has been set up using the Sparse option, and all instances are NULL. The field with INT datatype (4 bytes) is fixed-length, so 4 x 1000 is 4000 bytes. Each row has a row header of 4 bytes, so again 4 x 1000 is 4000 bytes. This makes a total of 8000 bytes.
    Rene Castro
    El Salvador

    Reply
  • shatrughna kumar
    August 17, 2011 7:29 am

    Correct option should be #3 if table structure is like
    (BonusID INT NOT NULL,BonusAmount MONEY NULL)

    Reply
  • The correct answer is option 1) none.
    1000 rows of the money field which is created using sparse option having the value null takes up no space at all.

    because,
    The SQL Server Database Engine uses the SPARSE keyword in a column definition to optimize the storage of values in that column. Therefore, when the column value is NULL for any row in the table, the value requires no storage.

    Thanks & Regards
    Santosh.S
    Bangalore, India

    Reply
  • You save 8000 bytes… From your example 1000 records with sparse option only use 16k as opposed to 24k with sparse option not used. Useful demo thanks

    Matt

    US

    Reply
  • Ramakrishnan Srinivasan
    August 17, 2011 7:53 am

    Hi,

    Since ALL values of the column with data-type MONEY are NULL, and the table was set up with the SPARSE option, the NULL values would consume NO SPACE.

    Thus, option 1 is right.

    Ramakrishnan RS
    Mysore, India

    Reply
  • Correct answer option no – 1) None

    shekhar Gurav.
    Country : INDIA

    Reply
  • Option 1) is correct
    1) None

    As you have explained the money field will not contribute to the space used as it is all null with sparse option

    Leo Pius
    USA

    Reply
  • Aditya Bisoi (@AdityaBisoi07)
    August 17, 2011 8:53 am

    Question 17

    Ans : None

    Chennai,INDIA

    Reply
  • Gopalakrishnan Arthanarisamy
    August 17, 2011 9:26 am

    Correct Answer is # 3.

    MONEY datatype takes up (8 Bytes) and has been set up using the SPARSE option, and all instances (1000) are NULL. SPARSE column does not take storage space for NULL rows.

    INT Datatype is fixed-length and takes (4 Bytes) = 4 * 1000 = 4000 Bytes.
    Each row has a row header of (4 Bytes) = 4 * 1000 = 4000 Bytes.

    So total bytes occupied by “Bonus” table is 8000 Bytes.

    Note: SPARSE option is set only for MONEY column.

    Gopalakrishnan Arthanarisamy
    Unisys, Bangalore, India

    Reply
  • Option 1 is the right answer

    Manoj Sahoo

    India

    Reply
  • Ans is 1) NONE as sparse column means do not use space.

    Thanks.

    Sudhir Chawla
    Delhi, India

    Reply
  • Hi,

    Correct answer is Option

    1)None

    explanation :- when “sparse” Used and data in filed is null then is will not take any space in data-base so it is going to be None.

    I am from india.

    Reply
  • Uday Kumar B R
    August 17, 2011 9:54 am

    Answer is
    1>Null

    Bangalore,India

    Reply
  • Option 3 is correct.

    INT field will consume 4 bytes and every row will consume 4 bytes.
    So 8000 Bytes.

    INDIA

    Reply
  • Since all the values of column BonusAmount is Null and it is assined with SPARSE the correct answer is
    NONE

    Mumbai :::INDIA

    Reply
  • Nikhil Mahajan
    August 17, 2011 10:18 am

    correct option is option 1 ie.
    1) None

    As you have explained the money field will not contribute to the space used as it is all null with sparse option

    so the naswer is 1) None
    India

    Reply
  • The correct answer, according to this article, is none (1st option). If all records have NULL in the Money field and we set up our table with SPARSE option, the 1000 rows will use 0 bytes for Money field.

    I am from USA

    Reply
  • Correct answer is option 1 that is

    None

    Because here we have used parse option while setting money field.

    Mahmad Khoja
    INDIA
    AHMEDABAD

    Reply
  • Rajneesh Verma
    August 17, 2011 10:43 am

    As we are setting up Money Column using Sparse then it will not consume memory. So the correct option is.
    1. None.

    Thanks….
    Rajneesh Verma
    (INDIA)

    Reply
  • Option 1) is correct
    1) None

    As you have explained the money field will not contribute to the space used as it is all null with sparse option

    Malay Shah
    India

    Reply

Leave a Reply