How to Find Cardinality or Uniqueness for any Column? – Interview Question of the Week #177

June 10, 2018

Question: How to Find Cardinality or Uniqueness for any Column?

How to Find Cardinality or Uniqueness for any Column? - Interview Question of the Week #177 cardinalityofcolumn1

Answer: The question was actually came up during my training SQL Server Performance Tuning Practical Workshop, while I was explaining various concepts related to cardinality estimation, compatibility level and its impact on SQL Server’s performance.

Though the question is very simple, it is very valid as well. It is not possible that every single person knows the meaning of cardinality.

Let us see a very simple script which gives us cardinality in percentage from the AdventureWorks database table SalesOrderDetail and column SalesOrderID.

SELECT COUNT(DISTINCT SalesOrderID)*100.0/Count(*) 'Distinct_SalesOrderID (in %)'
FROM [AdventureWorks2014].[Sales].[SalesOrderDetail]

If you see the resultsets, it shows how many percentages of unique values are there for the column SalesOrderID details in the table.

How to Find Cardinality or Uniqueness for any Column? - Interview Question of the Week #177 cardinalityofcolumn-800x125

It is very important to understand the uniqueness as well as the datatype of the column and how it is used in SQL Server. They play very important and crucial role in creating indexes as well as building execution plan, however, I have often noticed that many performance tuning experts either do not know about it or do not talk about it during their training.

If you sign up for my SQL Server Performance Tuning Practical Workshop, I make sure that I explain you in detail with a real-world example.

Reference: Pinal Dave (https://blog.sqlauthority.com)

SQL Index, SQL Performance, SQL Scripts, SQL Server

How to Enable Files & Folders Level Auditing in SQL Server? – Interview Question of the Week #176

How Much Work Each Processor (CPU) is Doing in SQL Server? – Interview Question of the Week #178

1 Comment. Leave new

Tom Wickerath
August 7, 2018 8:31 am
Hi Pinal,
I’m having trouble understanding the true meaning of the single cardinality percent number (25.936% in your example). I also don’t have a copy of AdventureWorks handy, so I created a simple example:
Create database NutriciousFoods;
go
Use NutriciousFoods
go
create table Foods
(
product varchar(50) not null primary key,
category varchar(50) not null
);
create index idxCategory on Foods (category);
insert into Foods (product, category) values (‘Tomato’, ‘Vegetable’);
insert into Foods (product, category) values (‘Broccoli’, ‘Vegetable’);
insert into Foods (product, category) values (‘Celery’, ‘Vegetable’);
insert into Foods (product, category) values (‘Lettuce’, ‘Vegetable’);
insert into Foods (product, category) values (‘Spinach’, ‘Vegetable’);
insert into Foods (product, category) values (‘Arugula’, ‘Vegetable’);
insert into Foods (product, category) values (‘Apple’, ‘Fruit’);
insert into Foods (product, category) values (‘Pear’, ‘Fruit’);
insert into Foods (product, category) values (‘Banana’, ‘Fruit’);
insert into Foods (product, category) values (‘Grape’, ‘Fruit’);
go
— Look at data
select * from foods
order by category, product;
— Determine Cardinality or Uniqueness using Pinal’s formula:
SELECT COUNT(DISTINCT category)*100.0/Count(*) ‘Distinct_Category (in %)’
FROM dbo.Foods;
–> Result = 20%
Question # 1) When considering the above data, what does this 20% refer to?
I created the above table in Oracle, insert the same data, and then use the ratio_to_report method, I get values that makes perfect sense:
select category, (ratio_to_report(count(*)) over() * 100) as “PctTotal”
from foods
group by category
order by category;
–> Category PctTotal
Fruit 40%
Vegetable 60%
This type of information seems more useful, where one gets a percentage for each unique item. I believe this type of information is useful to help troubleshoot parameter sniffing issues.
Question # 2) What is the easiest way in SQL Server to get the same type of information that Oracle’s ratio_to_report displays? I was able to create a query that uses two Common Table Expressions, but I’m wondering if there might be an easier way to get the same result:
with v
as (
select distinct category, Cast(count(*) as decimal) As CategoryCount
FROM dbo.Foods
group by category
),
t
as (
select count(*) as TotalRows FROM dbo.Foods
)
select category, (CategoryCount*100 /TotalRows) as PctTotal
from v, t
order by PctTotal;
Thank You,
Tom
Reply

July 2025 Discount: Comprehensive Database Performance Health Check | Testimonials

How to Find Cardinality or Uniqueness for any Column? – Interview Question of the Week #177

Related Posts

SQL SERVER – Patch Install Rule Error – Not Clustered or the Cluster Service is Up and Online

SQL SERVER – Change Fill Factor – SQL in Sixty Seconds #069

SQL SERVER – Rule Windows Server 2003 FILESTREAM Hotfix Check failed on Windows 2012 R2 Cluster

1 Comment. Leave new

Leave a ReplyCancel reply