Whenever I am helping my clients with their SQL Server Performance Comprehensive Database Performance Health Check, every single time I get this question. What is the fastest way to retrieve rowcount for a table?
Performance issues most of the time arises when there is a suboptimal code or table is huge and it is properly not maintained. Whenever I ask my clients about the rowcount in the table, they usually run the code something very similar to the following code.
SELECT COUNT(*)
FROM TABLE WITH(NOLOCK)
I think it is very dangerous to run the code like above because it will do the complete table scan to find out the count of the row number. Even though there is nolock hint used it still has to do lots of work just get the rowcount.
Today in this video we will see fastest way tot retrieve rowcount for a table.
As you can see it is pretty simple to find the rowcounts if you use two system tables together. Here is the script which I have used in the video.
Script for RowCount
SELECT SCHEMA_NAME(t.schema_id) SchemaName, t.[name] TableName, SUM(p.row_count) TotalRows FROM sys.tables t INNER JOIN sys.dm_db_partition_stats p ON t.object_id = p.object_id AND t.type_desc = 'USER_TABLE' AND p.index_id IN (0,1) -- WHERE t.[name] = 'Invoices' GROUP BY t.schema_id, t.[name] ORDER BY TotalRows DESC
Of course, the same thing can be done with the help of the SQL Server Management Studio. I will blog about it some times in the future.
Bloopers
If you watch the complete video, at the end of the video, I have included some bloopers. It was my wife who really helped me edit this video and she thought it will be fun to include them. Let me know what you think about them.
Tiny Learning
In the video, there are two tiny learnings. 1) Getting the name of the schema from schema id and 2) Using the column alias in the order by.
Please leave a comment and give suggestions, what would you like to see more in this SQL in the Sixty Seconds series.
Reference: Pinal Dave (http://blog.SQLAuthority.com)
12 Comments. Leave new
Thanks for the above. Can you help with fastest row count with where clause and joins.
Hi Surya,
In that case, you will just have to go with the SELECT COUNT(*) method only.
Bloopers! Love them. Especially when they show problems with pronunciation :)
What about one-table query:
select object_schema_name(object_id), object_name(object_id), sum(rows) totalrows from sys.partitions
where
index_id in(0,1)
and objectproperty(object_id,’isusertable’)=1
group by object_id
order by totalrows desc
This is cheaper in terms IO, queried versus small database:
Table ‘Worktable’. Scan count 0, logical reads 0
Table ‘syssingleobjrefs’. Scan count 633, logical reads 1275
Table ‘sysidxstats’. Scan count 633, logical reads 1294
Table ‘sysrowsets’. Scan count 211, logical reads 2110
Table ‘Workfile’. Scan count 0, logical reads 0
Table ‘sysschobjs’. Scan count 1, logical reads 65
Table ‘syspalnames’. Scan count 1, logical reads 2
vs
Table ‘Worktable’. Scan count 0, logical reads 0
Table ‘Worktable’. Scan count 0, logical reads 0
Table ‘sysrowsets’. Scan count 2, logical reads 20
Great suggestions.
Thanks
A very dangerous recommendation without STRONGLY emphasizing that the correct result is not necessarily returned. The same error applies with NOLOCK !!!! Script below!
Number 1 (—— 1) : creates a database with about 9,000,000 rows.
Number 2 and 3: run simultaneously – in their respective connections – and show problems with NOLOCK. Only update is performed, so there will always be the same number of rows in the table and the sum should be the same!
Resultat
MinCount MaxCount DiffLowHigh DiffCounts NumberOfTest CorrectCount
8682209 8777781 95572 78 100 8682209
MinSum MaxSum DiffSum DiffNumbers NumberOfTest CorrectSum
3961977701 4012734905 50757204 78 100 3961977701
Numbers 4 and 5: also run simultaneously and show the problem of using the sys tables instead of ‘doing it right’. Since there are insertions and deletions that are not committed but rolled back, the calculated number of rows in the table will be different.
Result
MinCount MaxCount DiffLowHigh DiffNumbers NumbersOfTests CorrectNumber
8666237 8682218 15981 1174 10000 8682209
So very dangerous!!!
Scripts!
—— 1 Generate testdata
CREATE TABLE dbo.t
(
Id INT NOT NULL
CONSTRAINT t_PK PRIMARY KEY NONCLUSTERED,
F1 INT NOT NULL INDEX CL_t_F1 CLUSTERED,
F2 VARCHAR(4000) NULL,
F3 INT NOT NULL
);
GO
INSERT INTO dbo.t(Id, F1, F2, F3) VALUES
(1, 662, ‘A’, 2),
(2, 214, ‘A’, 2345),
(3, 119, ‘A’, 94),
(4, 443, ‘A’, 8),
(5, 615, ‘A’, 1255),
(6, 336, ‘A’, 1),
(7, 211, ‘A’, 345),
(8, 118, ‘A’, 12),
(9, 45, ‘A’, 45);
GO
SET NOCOUNT ON;
INSERT INTO dbo.t (Id, F1, F2, F3)
SELECT Id + (SELECT MAX(Id) FROM dbo.t),
(CAST(F1 AS BIGINT) * F1 * Id) % 352 ,
REPLICATE(LEFT(F2, 2), ((DATEPART(MS, SYSDATETIME()) + 10) % 2000)),
F3
FROM dbo.t;
GO 20
DELETE
FROM dbo.t
WHERE Id % 100 IN (7, 23, 26, 27, 28, 44, 78, 99);
GO
—– 2 Modify with UPDATE on text-column
SELECT COUNT(*)
FROM dbo.t;
GO
SET NOCOUNT OFF;
UPDATE dbo.t
SET F1 = CAST(F1 AS BIGINT) * DATEPART(MS, SYSDATETIME()) % (DATEPART(NS, SYSDATETIME()) + 2),
F2 = REPLICATE(LEFT(F2, 4), ((DATEPART(MS, SYSDATETIME()) + 99) % 800))
WHERE Id % 17 IN (2, 6, 3);
UPDATE dbo.t
SET F1 = CAST(F1 AS BIGINT) * DATEPART(NS, SYSDATETIME()) % (DATEPART(MS, SYSDATETIME()) + 4),
F2 = REPLICATE(LEFT(F2, 4), ((DATEPART(MS, SYSDATETIME()) + 123) % 560))
WHERE Id % 21 IN (3, 5, 12, 13, 19, 20);
UPDATE dbo.t
SET F1 = CAST(F1 AS BIGINT) * DATEPART(MS, SYSDATETIME()) % (DATEPART(NS, SYSDATETIME()) + 2),
F2 = REPLICATE(LEFT(F2, 8), ((DATEPART(MS, SYSDATETIME()) + 67) % 200))
WHERE Id % 9 IN (1, 3, 4);
GO
—— 3 COUNT and SUM with NOCOUNT
CREATE TABLE dbo.TableHintCount
(
ID INT NOT NULL IDENTITY,
Number INT NOT NULL,
SumF3 BIGINT NOT NULL
);
GO
INSERT INTO dbo.TableHintCount(Number, SumF3)
SELECT COUNT(*),
SUM(CAST(F3 AS BIGINT))
FROM dbo.t WITH (NOLOCK);
GO 100
SELECT MIN(Number) AS MinCount,
MAX(Number) AS MaxCount,
MAX(Number) – MIN(Number) AS DiffLowHigh,
COUNT(DISTINCT Number) AS DiffCounts,
COUNT(*) AS NumberOfTest,
(SELECT COUNT(*) FROM dbo.t) AS CorrectCount,
MIN(Number) – (SELECT COUNT(*) FROM dbo.t) AS MinMinusActuel,
MAX(Number) – (SELECT COUNT(*) FROM dbo.t) AS MaxMinusActuel,
‘NOLOCK’ AS TestType
FROM dbo.TableHintCount;
SELECT MIN(SumF3) AS MinSum,
MAX(SumF3) AS MaxSum,
MAX(SumF3) – MIN(SumF3) AS DiffSum,
COUNT(DISTINCT SumF3) AS DiffNumbers,
COUNT(*) AS NumberOfTest,
(SELECT SUM(CAST(F3 AS BIGINT)) FROM dbo.t) AS CorrectSum,
MIN(SumF3) – (SELECT SUM(CAST(F3 AS BIGINT)) FROM dbo.t) AS MinMinusActuel,
MAX(SumF3) – (SELECT SUM(CAST(F3 AS BIGINT)) FROM dbo.t) AS MaxMinusActuel,
‘NOLOCK’ AS TestType
FROM dbo.TableHintCount;
GO
—— 4 DELETE and INSERT with ROLLBACK
BEGIN TRANSACTION;
DELETE
FROM dbo.t
WHERE Id % 6500 BETWEEN DATEPART(SECOND, SYSDATETIME()) AND DATEPART(SECOND, SYSDATETIME()) + 10;
WAITFOR DELAY ’00:00:01′;
ROLLBACK TRANSACTION;
BEGIN TRANSACTION;
INSERT INTO dbo.t(Id, F1, F2, F3) VALUES
(20000001, 165, ‘X’, 112),
(20000002, 532, ‘X’, 235),
(20000003, 122, ‘X’, 942),
(20000004, 543, ‘X’, 991),
(20000005, 341, ‘X’, 3412),
(20000006, 423, ‘X’, 1555),
(20000007, 134, ‘X’, 267),
(20000008, 222, ‘X’, 666),
(20000009, 456, ‘X’, 415);
WAITFOR DELAY ’00:00:01′;
ROLLBACK TRANSACTION;
GO 50
—— 5 Use systemtables instead of COUNT(*)
CREATE TABLE dbo.SysTablesCount
(
ID INT NOT NULL IDENTITY,
Number INT NOT NULL,
);
GO
INSERT INTO dbo.SysTablesCount(Number)
SELECT SUM(p.row_count) AS TotalRows
FROM sys.tables AS t INNER JOIN sys.dm_db_partition_stats AS p
ON t.object_id = p.object_id
WHERE t.name = ‘t’ AND
t.type_desc = ‘USER_TABLE’ AND
p.index_id IN (0, 1);
GO 10000
SELECT MIN(Number) AS MinCount,
MAX(Number) AS MaxCount,
MAX(Number) – MIN(Number) AS DiffLowHigh,
COUNT(DISTINCT Number) AS DiffNumbers,
COUNT(*) AS NumbersOfTests,
(SELECT COUNT(*) FROM dbo.t) AS CorrectNumber,
MIN(Number) – (SELECT COUNT(*) FROM dbo.t) AS MinimumMinusActuel,
MAX(Number) – (SELECT COUNT(*) FROM dbo.t) AS MaximumMinusActuel,
‘SYS-tables’ AS Testtype
FROM dbo.SysTablesCount;
SELECT DISTINCT Number
FROM dbo.SysTablesCount
ORDER BY Number;
I’ve had problems with the functions like SCHEMA_NAME and OBJECT_NAME locking during dictionary operations like ALTER, etc. I always use sys.schemas instead. I don’t know if it’s safer.
I reckon SELECT COUNT(0) would be slightly faster than SELECT COUNT(*)
The suggestion is to not use SELECT COUNT(*) at all.
In the last many versions, SQL Server has found the best index for SELECT COUNT (*). It is not better to write SELECT COUNT (constant) or a SELECT COUNT (PK). SQL Server will always find the smallest index to solve the problem.
Try to form a table without an index, but with many columns. Create an index on a column – e.g. a varchar (50) with, for example, an average of 20 characters. SQL Server will now use this index instead of tablescan. Create a new index – e.g. a column defined as SMALLINT – and SQL Server will use this index. So unless there is no index at all on the table, SQL Server will never do a table scan, but always index scan.
Even if you type SELECT COUNT(Adress), SQL Server will use a smaller index if the Address column is defined with NOT NULL.
So use SELECT COUNT(*) to tell people, who looks at the statemnet, that you wants to count number of rows in the table!!!!!!
SQL Server manages the optimization.
What about sp_Spaceused?
It will give you the result for a single table only.