# SQL SERVER – Simple Explanation and Puzzle with SOUNDEX Function and DIFFERENCE Function

Earlier this week I asked a question where I asked how to Swap Values of the column without using CASE Statement. Read here: A Puzzle – Swap Value of Column Without Case Statement,there were more than 50 solutions proposed in the comment. There were many creative solutions. I have mentioned my personal favorite (different ones) here: Solution of Puzzle – Swap Value of Column Without Case Statement.

However, I received lots of questions regarding one of the Solution by SIJIN KUMAR V P. He has used the function SOUNDEX in his solution. The request was to explain how SOUNDEX and DIFFERENCE works. Well, there are pretty decent documentations provided over here SOUNDEX function and DIFFERENCE over on MSDN and if I attempt to explain this function I will end up writing the same details which are available on MSDN. Instead of writing theory, we will try to learn this function by using a couple of simple puzzles. You try to solve the puzzles using the MSDN and see if you can learn something very quickly.

In simple words – SOUNDEX converts an alphanumeric string to a four-character code to find similar-sounding words or names. The first character of the code is the first character of character_expression and the second through fourth characters of the code are numbers that represent the letters in the expression. Vowels incharacter_expression are ignored unless they are the first letter of the string. DIFFERENCE function returns an integer value. The  integer returned is the number of characters in the SOUNDEX values that are the same. The return value ranges from 0 through 4: 0 indicates weak or no similarity, and 4 indicates strong similarity or the same values.

Learning Puzzle 1:

Now let us run following four queries and observe its output.

`SELECT SOUNDEX('SQLAuthority') SdxValue SELECT SOUNDEX('SLTR') SdxValue SELECT SOUNDEX('SaLaTaRa') SdxValue SELECT SOUNDEX('SaLaTaRaM') SdxValue `

When you look at the result set all the four values are same.

The reason for all the values to be same is as for SQL Server SOUNDEX function all the four strings are similarly sounding string.

Learning Puzzle 2:

Now let us run following five queries and observe its output.

`SELECT DIFFERENCE (SOUNDEX('SLTR'),SOUNDEX('SQLAuthority')) SELECT DIFFERENCE (SOUNDEX('TH'),SOUNDEX('SQLAuthority')) SELECT DIFFERENCE ('SQLAuthority',SOUNDEX('SQLAuthority')) SELECT DIFFERENCE ('SLTR',SOUNDEX('SQLAuthority')) SELECT DIFFERENCE ('SLTR','SQLAuthority')`

When you look at the result set you will get the result in the ranges from 1 to 4.

Here is how it works if your result is 0 which means absolutely not relevant to each other and if your result is 1 which means the results are relevant to each other.

Have you ever used above two functions in your business need or on production server? If yes, would you please leave a comment with use cases. I believe it will be beneficial to everyone.

Reference: Pinal Dave (https://blog.sqlauthority.com)

#### Related Posts

• great job buddy nice post

• In my machine,value of SELECT SOUNDEX(‘SQLAuthority’) SdxValue is S243 and rest are S436.WHy ?

I think we can use SOUNDEX and DIFFERENCE in search engine.There can be value in database which “Like operator” ignore,in that scenerio we can use SOUNDEX and DIFFERENCE.

For example, if i search “Lebron Jams” and the database has values “Lebron James”, “Lebron Jim”, “Michael James”, etc , the results that system returns should be ordered by similarity between strings, i.e, the results should be “Lebron James”, “Lebron Jim” and “Michael James”.

It is hard to implement this by using sql’s ‘like’.

• I just do a test and found that value of SELECT SOUNDEX(‘Michael James’) in SQL SERVER DB is M240, but why the value in ORACLE DB is M242

• Like Kumar stated, I have the SOUNDEX() function running on a production website. It is used in a public search function for people’s names. Works very well and was a breeze to implement once I knew of its existence. Would have been a big hassle to implement otherwise.

• Nice, your posts are very informative. Kudos to the your course on Pluralsight. Looking forward for more videos,posts and insights on database design patterns

• Dear Pinal,

first, try to run these sql statement
SELECT
SOUNDEX(‘joanharimau’),
SOUNDEX(‘jhonsayang’),
DIFFERENCE(SOUNDEX(‘joanharimau’),SOUNDEX(‘jhonsayang’)),
DIFFERENCE(‘joanharimau’,’jhonsayang’)

the value will return
filed1 | field2 | field3 | field4
J565 J525 4 3

DIFFERENCE can only compare character expression. not soundex value.

and the return value of DIFFERENCE is also incorrect.
The return value ranges from 0 through 4: 0 indicates weak or no similarity, and 4 indicates strong similarity or the same values.

https://docs.microsoft.com/en-us/previous-versions/sql/sql-server-2008-r2/ms188753(v=sql.105)

https://support.microsoft.com/en-us/help/100365

• Pinal,
I’m trying to understand how the DIFFERENCE function works. I am seeing strange results.

DECLARE @t VARCHAR(20), @s VARCHAR(20)
SET @t = ‘SDRF’
SET @s = ‘FRDS’

SELECT SOUNDEX(@t), SOUNDEX(@s),DIFFERENCE(@t,@s)

Gives:
S361, F632, 1 Yet, neither soundex value has a common digit in sequence.

DECLARE @t VARCHAR(20), @s VARCHAR(20)
SET @t = ‘SDRF’
SET @s = ‘SFRD’

SELECT SOUNDEX(@t), SOUNDEX(@s),DIFFERENCE(@t,@s)

Gives:
S361, S163, 2 Which makes sense, since they have the S and 6 in common.

DECLARE @t VARCHAR(20), @s VARCHAR(20)
SET @t = ‘SDRF’
SET @s = ‘SFFD’

SELECT SOUNDEX(@t), SOUNDEX(@s),DIFFERENCE(@t,@s)

Gives:
S361, S130, 2 First digit in common, but no others. ????

DECLARE @t VARCHAR(20), @s VARCHAR(20)
SET @t = ‘HELPER’
SET @s = ‘QUIZ’

SELECT SOUNDEX(@t), SOUNDEX(@s),DIFFERENCE(@t,@s)

Gives:
H416, Q200, 0 Which is what one would expect.

I’m confused by the less than consistent results. I’m sure I’m missing some key piece in the puzzle. Do you understand how this function works?