Every day I spend a good amount of time with different customers helping them with SQL Server Performance Tuning issues. Today we will discuss the question asked by a developer at the organization where I was engaged in Comprehensive Database Performance Health Check. The developers wanted to know if there is any way he can randomly select n rows from a table.
This question is indeed very interesting as so far we have heard that is they is a way to retrieve TOP or BOTTOM rows but in this case, they wanted to retrieve random rows from any particular table. Additionally, they wanted to make sure that whatever the row they select ones, the same set of rows should not be selected next time and the rows should be truly random.
Honestly, it is possible to retrieve random rows from any tables. Let us see a simple example on the AdventureWorks database.
USE AdventureWorks2014 GO SELECT TOP 10 * FROM [Production].[Product] ORDER BY NEWID() GO
When you run the above code every single time you will see a different set of 10 rows. The trick is to add ORDER BY NEWID() to any query and SQL Server will retrieve random rows from that particular table.
I find this method very effective and would love to know your feedback about the same. Do let me know if you use any other trick and I will be happy to blog about retrieving n rows from any table randomly.
Reference: Pinal Dave (http://blog.SQLAuthority.com)
9 Comments. Leave new
Hi
This can also work , if you can figure out how to generate a random number and pass it to the between clause then this can work well
WITH CTE_Random
AS
(SELECT ROW_NUMBER() OVER(ORDER BY ProductID) AS CNT, * FROM production.product )
SELECT * FROM CTE_Random WHERE cnt BETWEEN 300 AND 600
I tested the newid() solution on a large table , first run was 12 seconds and second run was 3 seconds
@julian , I did test your suggestion and always get the same results not random records, is this correct ? hmmm
Hi Carlos
This solution is not 100 % , you have to change the values in the between clause unless you can figure out a way to pass these values automatically, I just did not have the time to work that out
HI Carlos
Try this , you can change the Rand values to what ever you want
DECLARE @random1 int,
@random2 int
SET @random1 = (SELECT FLOOR(RAND()*(50-10+1))+10)
SET @random2 = (SELECT FLOOR(RAND()*(100-10+1))+50)
;
WITH CTE_Random
AS
(SELECT ROW_NUMBER() OVER(ORDER BY ProductID) AS CNT, * FROM production.product )
SELECT * FROM CTE_Random WHERE cnt between @random1 and @random2
How do we use this query in Query Shortcuts. By selecting table name and if I click the shortcut, then it should display n randow rows.
I love your site! You have helped me out so much over the years!!! Thanks!
Let’s say I have 40 records and I used Row_Number /partition by key column
1st set of the key column has 13 records — I need to pick 2 random record from this set
2nd set of the key column has 20 records — I need to pick 5 random record from this set
3rd set of the key column has 7 records — I need to pick 3 random record from this set
is it possible? if yes how?
What happens when it just happens that your random ID assignments are right in the middle of a GAP in the ID? You get nothing back.
I am aware this is 2018 thread.
Latest SQL Server allows you, using TABLESAMPLE option
SELECT * from dbo.FactTable TABLESAMPLE ( 10 PERCENT )
SELECT * from dbo.FactTable TABLESAMPLE ( 10 PERCENT ) REPEATABLE (25)
SELECT * from dbo.FactTable TABLESAMPLE ( 100 ROWS )
SELECT * from dbo.FactTable TABLESAMPLE ( 2000 ROWS ) REPEATABLE (25)
You can also add further conditions using where clause. Say,
SELECT * from dbo.FactTable TABLESAMPLE ( 100 ROWS ) WHERE YEAR(UpdateDT) = 2019