Question: How to Get Random Records from Table?
Answer: This question seems to be never get old. I have previously answered this question multiple times on my blog, but I still get this question from many readers who prefer sending email over doing searches on Google.
There are many ways to retrieve random records, here is the one method which I often use it as I can remember the syntax of the same easily.
-- First Execution SELECT FirstName, LastName FROM [Person].[Person] ORDER BY CHECKSUM(NEWID());
You can see in the image when I ran the above script multiple times, every single time it gave us different result. This is because we have used the NEWID() in the order by and it brings back different results every single time.
Additionally, please note that above query can be quite expensive if you have tables with hundred thousands of the rows.
Here is the additional link for you to refer to the same subject.
- Retrieving Random Rows from Table Using NEWID()
- Generate A Single Random Number for Range of Rows of Any Table – Very interesting Question from Reader
- SQL SERVER – Random Number Generator Script – SQL Query
Let me know if you are aware of any other method and you believe that is more efficient than the method displayed in this blog post. I will be happy to do another blog post with due credit to you.
Reference: Pinal Dave (https://blog.sqlauthority.com)
If I remove the checksum () function, the result of the query would still be the same. What is the real need to use it?
As the title says – How to Get Random Records from Table
Hello Pinal, I have same question we can generate a unique number for each raw using the NEWID() and if we perform ORDER BY(NEWID()) it always provide a random numbers of record, but why you use Order By(CHECKSUM(NEWID)) , is there some advantages if i use CHECKSUM() with NEWID().
It depends if generating the newid, or sorting the table that is the expensive part. If generating the I’d then you can still get a random ordering from the md5 of the primary (or surrogate) key. This has an advantage of being repeatable, and if you want a run just for your report, add the (milli) second TimeStamp as a salt or offset to an integer column. Note : (binary) checksum is not random enough for this purpose, you have to use something else.
If sorting is the expensive part, then you can get a random cut of the db by asking what the char values of the hash are : eg, keep hashes like ‘AA[0-3]%’ for a 1 in 1024 cut, or if you don’t mind a biased cut, and have an identity column on hand : right(identity, 3)=1 for 1 in 1000
Thanks for your help Andrew.