SQL SERVER – Tips from the SQL Joes 2 Pros Development Series – Data Row Space Usage and NULL Storage – Day 15 of 35

Answer simple quiz at the end of the blog post and -
Every day one winner from India will get Joes 2 Pros Volume 3.
Every day one winner from United States will get Joes 2 Pros Volume 3.

Data Row Space Usage

Most of a table’s space is occupied by its records. Indexes and other properties use a relatively small amount of known space for the table.  Suppose your company – or a hiring manager – shows you the design of the SalesInvoiceDetail table and says, “We expect this table to receive an average of 100,000 records per day during the next two years. How much hard drive space should we purchase to handle this expected growth?” You know how many rows will be received in a day and how many days there are in a year.  The unknown in this scenario is the amount of space each row will use. If you calculate the amount of space each row needs, you can then answer this resource planning question for the new table.

100,000 rows/day * 365 days    =  36,500,000 rows

36,500,000 rows   * __ KB/row = ____ KB of storage space needed

Our calculations will similarly focus on data rows. In order to estimate a row’s space consumption, we must know the amount of space each field’s data will use.

There are three key components which contribute to a field’s space consumption:

· The data type
· Whether the data type is fixed or variable
· Whether the field is nullable

Common Data Types

The names and storage amounts per field for many commonly used data types are shown here.

Exact numeric data types
int (integer)                    4 bytes
bigint                           8 bytes
smallint                        2 bytes
tinyint                          1 byte
money                                     8 bytes
smallmoney                 4 bytes
decimal                        5-17 bytes, depending on the number of digits
numeric                       5-17 bytes, depending on the number of digits
bit                                1-8 bit fields use 1 byte; 9-16 bit fields use 2 bytes; etc.

Row Header

Every row has a 4 byte header. This is a positioning header which keeps track of where the row is placed in the table and which fields the row contains.

Update from Paul Randal (Blog | Twitter): The 4 byte row header has nothing to do with the position of the record or which columns are in the record. It contains two bytes (one for index records) that say what kind of record it is, plus the offest of the null bitmap. The nulll bitmap is always present in data records, regardless of whether the columns are nullable or not – unless the table is comprised solely of sparse columns.

Fixed Data

Fixed length data types always occupy the amount of space allotted to them. For example, an int will always use 4 bytes. A char(3) always takes up 3 bytes, even if the field contains just 1 or 2 characters. Fixed length data is predictable and the easiest type of data for SQL Server to manage. Calculations involving fixed data are straightforward.  However, variable length data incurs additional overhead.

Variable Block

Every record containing variable length data includes something called a variable block. The first time you create a field with a variable length data type (e.g., varchar, nvarchar), the variable block is created. This block keeps track of the number of variable length data fields within the record and takes up 2 bytes.

The more variable length fields you have, the bigger the variable block grows. Each variable length field adds another 2 bytes to the block. (These 2 bytes keep track of where the data is positioned within the row.) For example, if you have one varchar field in your table, your variable block would contain 4 bytes for each record in the table. If you have two varchar fields, your variable block would contain 6 bytes (2 bytes per field plus 2 bytes to set this up).

Variable Data

Variable length data types do pretty much what the name implies – they expect the data length to vary from row to row. Fields using variable length data types, such as varchar and nvarchar, are typically name or address fields where you aren’t certain how long the data will be.

The advantage these data types offer is that shorter names or addresses can take up less storage space than a fixed data type. For example, if you have a char(100) field to allow for long addresses, then that field always uses 100 bytes no matter how long the data actually is. However, if you know that most of your addresses will consist of 20 characters, you probably would choose a varchar(100) to use less storage space but retain the flexibility to accept addresses up to 100 characters in length.

Varchar data consumes 1 byte per character. Nvarchar data consumes 2 bytes per character, because it is Unicode. For example a varchar(10) field containing the name “Rick” would consume 4 bytes. If it were an nvarchar, it would use 8 bytes.

This is one difference between fixed and varying length data. With fixed data, you can calculate the storage consumption without needing to look at the actual data. However, to precisely calculate how much storage a row or table is actually utilizing, you would need to examine the length of data in each of the variable data fields (the LEN( ) function is shown in the figure below.

Now let’s bring in the design interface for the RoomChart table, which we will use to calculate the actual space consumption for rows in the RoomChart table. We will look at the table design along with our LEN( ) query result so that we have the length measurements for the RoomName field.

Let’s calculate the actual space consumption for Row 1 of the RoomChart table. We will begin with the fixed length data. Each of the four rows contains two fixed length fields. The ID field uses 4 bytes and the field named “Code” uses 3 bytes. Each row also has a 4 byte header. Thus, without looking at the data, we already know each row uses at least 11 bytes.

Header + Fixed Length Fields (ID and Code fields)

[4 bytes + 4 bytes  + 3 bytes      = 11 bytes]

The final field (RoomName) contains variable length data, so in order to evaluate the space consumption we must: 1) calculate the variable block; and 2) look at the actual variable length data.

Since there is one variable field per row, we must allow 2 bytes for the creation of the variable block. Then we must multiply the number of variable field(s) in the row by 2 bytes.

Variable Block

[2 bytes + (1 field * 2 bytes/field) = 4 bytes]

Actual Data

[Renault-Langsford-Tribute, 25 unicode chars = 50 bytes]

Header 4 | Fixed Data 7 | Variable Block 4 | Variable Data 50 = 65 bytes

Null Data

One important piece of the storage calculation we haven’t yet considered is the null block. Somewhat like variable length data fields, each record in a table containing nullable field(s) uses a little extra storage space.

Null Block

Each record begins with a standard 4 byte row header. Right after the row header, the first item in the data portion of the record is the fixed data. SQL Server stores together all of the columns containing fixed width data.

If your table contains nullable data, then a null block follows the fixed data and occupies the third space in the physical structure of the record. (Without the null block, the usual order prevails – #1 Row Header, #2 Fixed Data, #3 Variable Block, and #4 Variable Data payload.)

The null block (also called the null bitmap) is created as soon as a nullable field is created in a table. The null block in each row begins as 2 bytes but may grow as you add more fields to your table.

Next you must count the total number of columns in the table. Add an additional byte to the row’s null block for the first field and another byte for every 8th field. In other words, if a table has between 1-8 fields, then the null block in each row will be 3 bytes. If the table contains 9-16 fields, then the null block will be 4 bytes per row. If the table contains 17-24 fields, then the null block will be 5 bytes per row, and so forth.

These additional bytes contain an indication for each column’s nullability. In other words, whether the column will allow nulls (e.g., Code, RoomName) or won’t allow nulls (e.g., the ID column in the RoomChart table).

Null Block Storage Allocation

It often surprises people to know that it only takes one nullable field to cause every field in the table to take up 1 bit of extra space in the null block. These additional bits keep track of each column whether it does or doesn’t contain a null. The following diagram illustrates two tables:  one in which every column is nullable and one containing only a single nullable column.

Each table contains 10 columns, c1 through c10. Since there are 10 columns and each table contains at least one nullable column, 10 additional bits are needed in the null block. The null block will contain a 2-byte fixed length field and a variable length bitmap of 1 bit per column. Memory is allocated in 8-bit bytes and 10 bits (c1 – c10) crosses into the next byte.  The variable length bitmap takes up 2 more bytes, bringing the size to 4 bytes for each tables’ null block.

Note that in table t1, an INSERT statement placed the integer values 1 through 10 in columns c1 through c10, respectively. Since none of these values is null, the null block bitmap contains 0’s for columns c1 through c10. On the right half of this figure, you see these bits are located in byte 1 for columns c1 through c8, and byte 2 for columns c9 and c10. The additional 6 bits of byte 2 in the null block bitmap are not used by any columns. In table t2 we have one null value in the last column. An INSERT statement placed the integer values 1 through 9 in columns c1 through c9, and null in column c10. Columns c1 through c9 are not null, so the null block bitmap contains 0’s (zeros) for those columns. Column c10 does contain null, so its bit in the null block bitmap reflects a 1.

In a table that contains at least one nullable column, each row will contain a null block whose length depends on the number of columns in the table. If a row of such a table contains a null value for a nullable column, its bit in that row’s null block bitmap will be set to 1. Columns which are not null have their bits in the row’s null block bitmap set to 0.

Recall that we calculated the space consumption for Row 1 as 65 bytes.

Actual Data

[Renault-Langsford-Tribute, 25 unicode chars = 50 bytes]

Header 4 | Fixed Data 7 | Variable Block 4 | Variable Data 50 = 65 bytes

We know a null block is needed, because there are nullable fields in the RoomChart table. There are two nullable fields, Code and RoomName.

Creating the null block uses 2 bytes. Then you must count the total number of columns in the table. Since this table contains 3 columns, 1 byte is added to the null block.

Null Block

[2 bytes + 1 byte (only 3 fields)] = 3 bytes]

So we must add 3 bytes to our original storage calculation for Row 1:

Header 4 | Fixed Data 7 | Null Block 3 | Variable Block 4 | Variable Data 50 = 68 bytes

Thus, the full amount of space used by Row 1 of the RoomChart table is 68 bytes. Now let’s recalculate the second row’s space usage including the null block. Recall we calculated the space consumption for Row 2 as 53 bytes.

Actual Data

[Quinault-Experience, 19 unicode chars = 38 bytes]

Header 4 | Fixed Data 7 | Variable Block 4 | Variable Data 38 = 53 bytes

Since the null block for each record in the table will be the same size, we know the null block for Row 2 will be the same as Row 1:  3 bytes.

Null Block

[2 bytes + 1 byte (only 3 fields)] = 3 bytes]

So we must add 3 bytes to our original storage calculation for Row 2:

Header 4 | Fixed Data 7 | Null Block 3 | Variable Block 4 | Variable Data 38 = 56 bytes

Note: If you want to setup the sample JProCo database on your system you can watch this video. For this post you will want to run the SQLArchChapter4.0Setup.sql script from Volume 3.

Question 15

You have three variable length data fields. What are the rules that go into the calculation of how large the variable block will be (Choose two)?

  1. You will allocate 2 bytes to the creation of the variable block
  2. You will allocate 3 bytes to the creation of the variable block
  3. You will allocate 2 more bytes for each of the three variable fields
  4. You will allocate 1 byte for every eight columns in the table.

Rules:

Please leave your answer in comment section below with correct option, explanation and your country of resident.
Every day one winner will be announced from United States.
Every day one winner will be announced from India.
A valid answer must contain country of residence of answerer.
Please check my facebook page for winners name and correct answer.
Winner from United States will get Joes 2 Pros Volume 3.
Winner from India will get Joes 2 Pros Volume 3.
The contest is open till next blog post shows up at http://blog.sqlauthority.com which is next day GTM+2.5.

Reference:  Pinal Dave (http://blog.SQLAuthority.com)

About these ads

61 thoughts on “SQL SERVER – Tips from the SQL Joes 2 Pros Development Series – Data Row Space Usage and NULL Storage – Day 15 of 35

  1. Correct asnwers are No. 1 and No. 3, because 2 bytes are needed to create a variable block (answer No. 1), and 2 bytes are needed for each variable field (answer No. 3)
    Rene Castro
    El Salvador

    Like

  2. Some corrections:
    The 4 byte row header has nothing to do with the position of the record or which columns are in the record. It contains two bytes (one for index records) that say what kind of record it is, plus the offest of the null bitmap.
    The nulll bitmap is always present in data records, regardless of whether the columns are nullable or not – unless the table is comprised solely of sparse columns.

    Like

  3. Pingback: SQL SERVER – Win a Book a Day – Contest Rules – Day 0 of 35 Journey to SQLAuthority

  4. 1) and 3)
    1) You will allocate 2 bytes to the creation of the variable block
    3) You will allocate 2 more bytes for each of the three variable fields

    The variable block is takes 2 bytes first time. Each of the variable fields will add 2 bytes each,

    Like

  5. Creation of the variable block will take up 2 bytes, and every variable block will subsequently add 2 bytes to this amount. Hence,

    2 bytes + (3 variable-length-fields * 2 bytes per variable-length-field) = 8 bytes

    Options 1 and 3 support the above calculation, and therefore, they are correct

    Ramakrishnan RS,
    Mysore, India

    Like

  6. Correct answers are option 1 & 3

    1.You will allocate 2 bytes to the creation of the variable block
    3.You will allocate 2 more bytes for each of the three variable fields

    because Variable block requires 2 bytes for the creation and another 2 more bytes for each of the three variable fields to set this up.

    Formula
    [2 bytes (for creation) + (1 fields * 2 bytes/field) = 4 bytes]

    As the calculation is for three variable length data fields, the total number of bytes would be

    [2 bytes (for creation) + (3 fields * 2 bytes/field) = 8 bytes] is the answer

    Thanks.

    Country – India

    Like

  7. The correct answers are #1 and #3
    You will allocate 2 bytes to the creation of the variable block
    You will allocate 2 more bytes for each of the three variable fields

    I am from USA

    Like

  8. The correct answers are Options-1 and 3
    1).You will allocate 2 bytes to the creation of the variable block
    3).You will allocate 2 more bytes for each of the three variable fields

    Like

  9. Hi,
    Option 1 and 3 are correct Answers.
    1.) You will allocate 2 bytes to the creation of the variable block
    3.) You will allocate 2 more bytes for each of the three variable fields

    Thanks…
    Rajneesh Verma
    (INDIA)

    Like

  10. The correct answers are #1 & #3.

    Reason: As has been put up very well in the article: “2 bytes per field plus 2 bytes to set this up”:
    1. You will allocate 2 bytes to the creation of the variable block
    3. You will allocate 2 more bytes for each of the three variable fields

    Country of residence: India

    On a side-note:
    Pinal: This series is absolutely wonderful. Especially today’s article. I was aware about the high-level storage concepts, but not about how fixed-length, NULL blocks and variable-length fields are arranged on the row.

    I got to learn a great deal today, and I thank-you and the other writes of the “Joes 2 Pros” series.

    Like

  11. Hi,

    1) You will allocate 2 bytes to the creation of the variable block

    3) You will allocate 2 more bytes for each of the three variable fields

    Base on this bellow equation above 2 Choose is correct.

    [2 bytes + (1 field * 2 bytes/field) = 4 bytes]

    [2 bytes + (3 field * 2 bytes/field) = 8 bytes]

    Correct ans. is : – 1 & 3

    I am from Ahmedabad,Gujarat,India

    Like

  12. Options 1 and 3 are correct.
    1.) You will allocate 2 bytes to the creation of the variable block
    3.) You will allocate 2 more bytes for each of the three variable fields

    Like

  13. Correct answer is Option 1 and 3.

    1.You will allocate 2 bytes to the creation of the variable block.
    3.You will allocate 2 more bytes for each of the three variable fields

    Country:India

    Thanks,
    Fazal Vahora

    Like

  14. Hi Sir,

    The correct options are 1 and 3
    1. we should allocate 2 bytes to the creation of the variable block
    and
    3. we will allocate 2 more bytes for each of the three variable fields.

    so the calculation goes like this :

    We have three variable fields so

    [2 bytes + (3 field * 2 bytes/field) = 8 bytes]

    Option 2 is incorrect as we allocate only 2 bytes for the creation of the variable block.
    Option 4 doesn’t satisfy the rules.

    P.Anish Shenoy,
    INDIA,Bangalore,Karnataka

    Like

  15. Correct answer is 1) and 3)
    1) You will allocate 2 bytes to the creation of the variable block
    3) You will allocate 2 more bytes for each of the three variable fields

    The variable block is takes 2 bytes first time. Each of the variable fields will add 2 bytes each,

    My country of residence is UK

    Like

  16. 1) and 3)
    1) You will allocate 2 bytes to the creation of the variable block
    3) You will allocate 2 more bytes for each of the three variable fields

    The variable block is takes 2 bytes first time. Each of the variable fields will add 2 bytes each,

    I’m from INDIA

    Like

  17. Hi Pinal,

    Challenge:
    Question 15
    You have three variable length data fields. What are the rules that go into the calculation of how large the

    variable block will be (Choose two)?

    1.You will allocate 2 bytes to the creation of the variable block
    2.You will allocate 3 bytes to the creation of the variable block
    3.You will allocate 2 more bytes for each of the three variable fields
    4.You will allocate 1 byte for every eight columns in the table.

    Correct Answer:
    The correct answer is #1 and #3.

    Explanation:
    If you have any variable length fields, a Variable Block is created, consisting of 2 bytes initially. For each variable length field, 2 additional bytes are used. In this example, 3 variable length fields will result in a Variable Block of 8 byes.

    #2 is incorrect. You need to allocate 2 bytes for the creation of the Variable Block, not 3.

    #4 is referring to the algorithm for calculating how many bytes are needed in the Null Block, besides the initial 2 bytes.

    Country:
    United States

    Thanks for the knowledge!

    Regards,

    Bill Pepping

    Like

  18. To obtain the correct answer, the calculation must take into account both 1 and 3:

    1.You will allocate 2 bytes to the creation of the variable block
    3.You will allocate 2 more bytes for each of the three variable fields

    Country: United States

    Like

  19. Answer is #1 & #3

    1.You will allocate 2 bytes to the creation of the variable block
    3.You will allocate 2 more bytes for each of the three variable fields

    per your article that states “Since there is one variable field per row, we must allow 2 bytes for the creation of the variable block. Then we must multiply the number of variable field(s) in the row by 2 bytes.”

    Deb- USA

    Like

  20. Ans is 1 and 3
    1) You will allocate 2 bytes to the creation of the variable block
    3) You will allocate 2 more bytes for each of the three variable fields

    Creation of the variable block will take up 2 bytes, and every variable block will subsequently add 2 bytes to this amount.

    2 bytes + (1 field * 2 bytes/field) = 4 bytes
    2 bytes + (3 variable-length-fields * 2 bytes per variable-length-field) = 8 bytes

    Partha
    India

    Like

  21. The Correct Options for the question “You have three variable length data fields. What are the rules that go into the calculation of how large the variable block will be” is

    Option 1 ) and Option 3)

    1) You will allocate 2 bytes to the creation of the variable block

    AND

    3) You will allocate 2 more bytes for each of the three variable fields

    Why 1 and thee are correct

    We have any variable length fields, a Variable Block is created, consisting of 2 bytes starting. Now for every variable length field, 2 additional bytes are used. In this query, 3 variable length fields will result in a Variable Block of 8 byes.

    why Others are wrong

    Option 2) is invalid because we require to allocate 2 bytes for the creation of the Variable Block but not 3.

    Option 4) This Option is referring to the algorithm for calculating how many bytes are needed in the Null Block along with the initial 2 bytes.

    Diljeet Kumari
    Country : India

    Like

  22. Option 1 and 3 are correct

    1.You will allocate 2 bytes to the creation of the variable block
    3.You will allocate 2 more bytes for each of the three variable fields

    These two options will provide the desire result.
    (Sale, Nigeria)

    Like

  23. 1 and 3.

    The variable block will be a total of 8 bytes, for three variable length columns. 2 bytes to create the block, plus 2 bytes for each variable length column.

    Like

  24. 1.You will allocate 2 bytes to the creation of the variable block
    3.You will allocate 2 more bytes for each of the three variable fields

    Gordon Kane
    Allen TX
    USA

    Like

  25. 1. You will allocate 2 bytes to the creation of the variable block.
    3 .You will allocate 2 more bytes for each of the three variable fields

    Chetan – USA

    Like

  26. The correct answers are #1 and #3
    You will allocate 2 bytes to the creation of the variable block
    You will allocate 2 more bytes for each of the three variable fields
    but pinal how we see practically clearly
    kkmishra
    India

    Like

  27. Hi Pinal Sir,

    The Correct Answer for the above question is Options 1 and Option 3.

    option1)You will allocate 2 bytes to the creation of the variable block + ” and ” +
    Option 3) You will allocate 2 more bytes for each of the three variable fields

    Explanation Why Option #1 and #3 combined are correct

    We require variable length fields and a Variable Block created which is consisting of 2 bytes at the start.

    Now here every variable length field two additional bytes are added. In this query, 3 variable length fields will result in adding Variable Block of 8 byes.

    why Others are wrong :

    Option 2) Here we need to allocate 2 bytes for the creation of the Variable Block but not 3 hence not correct.

    Option 4) This query is referring for the algorithm for calculating how many
    bytes are needed in the Null Block along with the initial 2 bytes hence Not valid.

    A very happy Independence Day to you !!!

    DILIP KUMAR JENA
    Country : INDIA

    Like

  28. The answer is 1 and 3.

    The first time you create a field with a variable length data type (e.g., varchar, nvarchar), the variable block is created.

    Each variable length field adds another 2 bytes to the block.

    I love you Pinal Dave.

    David
    USA

    Like

  29. Answer: 1 and 3.

    If there are any variable length data type in the record, it will have 2 byte variable block for the creation of the variable block.
    For each variable data column in the record, it will take additional 2 bytes in the record.

    Damodaran Venkatesan
    Country: USA.

    Like

  30. Correct answers are 1 and 3

    You will allocate 2 bytes to the creation of the variable block
    You will allocate 2 more bytes for each of the three variable fields

    USA

    Like

  31. Correct answers are # 1 and # 3.

    1.You will allocate 2 bytes to the creation of the variable block
    3.You will allocate 2 more bytes for each of the three variable fields

    Because, “Variable Block” requires 2 bytes for the creation and another 2 more bytes for each of the three variable fields to set this up.

    Formula
    [2 bytes (for creation) + (1 fields * 2 bytes/field) = 4 bytes]

    As the calculation is for three variable length data fields, the total number of bytes would be

    [2 bytes (for creation) + (3 fields * 2 bytes/field) = 8 bytes] is the answer.

    Gopalakrishnan Arthanarisamy
    Unisys, Bangalore, India

    Like

  32. Correct answers are:

    1.You will allocate 2 bytes to the creation of the variable block
    3.You will allocate 2 more bytes for each of the three variable fields

    Country: India

    Like

  33. Option 1 and 3 are correct

    1.You will allocate 2 bytes to the creation of the variable block
    3.You will allocate 2 more bytes for each of the three variable fields

    Rajesh Garg
    India

    Like

  34. Correct answers are 1 and 3

    1.You will allocate 2 bytes to the creation of the variable block.
    3.You will allocate 2 more bytes for each of the three variable fields

    Country : India

    Like

  35. Correct answers are option 1 & 3

    1.You will allocate 2 bytes to the creation of the variable block
    3.You will allocate 2 more bytes for each of the three variable fields

    Like

  36. the correct option is –1) and 3)
    1) You will allocate 2 bytes to the creation of the variable block
    3) You will allocate 2 more bytes for each of the three variable fields

    The variable block is takes 2 bytes first time. Each of the variable fields will add 2 bytes each,

    india

    Like

  37. The correct answers are Options-1 and 3

    1).You will allocate 2 bytes to the creation of the variable block
    3).You will allocate 2 more bytes for each of the three variable fields

    Somnath Desai

    India

    Like

  38. Pingback: SQL SERVER – SSQL Architecture Basics – Core Architecture Concepts – Book Available for SQL Server Certification Journey to SQLAuthority

  39. I would like to know how to insert a Name with Apostrophe in SQL 2005. I did try this
    INSERT INTO Cust
    (R_CustName)
    VALUES ( ‘My”s”Name’)
    when you insert the result is My’s’Name
    i don’t need the apostrophe after s. how can i remove it while inserting..
    Any idea guys..

    Like

  40. Pingback: SQL SERVER – Clustered Index and Primary Key – Contest Win Joes 2 Pros Combo (USD 198) – Day 3 of 5 « SQL Server Journey with SQL Authority

  41. Pingback: SQL SERVER – Weekly Series – Memory Lane – #042 | Journey to SQL Authority with Pinal Dave

  42. It would be nice to see this article extended to cover BLOB data – varchar(max) and varbinary(max) and how that storage is computed.

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s