SQL SERVER – Remove Duplicate Entry from Comma Delimited String – UDF

I love reader’s contribution this blog as that brings variety in articles. I encourage my readers to provide their contribution and I will publish then with their name.

Blog Reader Ashish Jain has posted very simple script which will remove duplicate entry from comma delimited string. User Defined Function has very simple logic behind it. It takes comma delimited string and then converts it to table and runs DISTINCT operation on the table. DISTINCT operation removes duplicate value. After that it converts the table again into the string and it can be used.

I have modified original contribution from Ashish so now it completely covers the subject intended to cover. I would suggest that this UDF should be kept handy to perform this tedious task easily.

CREATE FUNCTION dbo.DistinctList
(
@List VARCHAR(MAX),
@Delim CHAR
)
RETURNS
VARCHAR
(MAX)
AS
BEGIN
DECLARE
@ParsedList TABLE
(
Item VARCHAR(MAX)
)
DECLARE @list1 VARCHAR(MAX), @Pos INT, @rList VARCHAR(MAX)
SET @list = LTRIM(RTRIM(@list)) + @Delim
SET @pos = CHARINDEX(@delim, @list, 1)
WHILE @pos > 0
BEGIN
SET
@list1 = LTRIM(RTRIM(LEFT(@list, @pos - 1)))
IF @list1 <> ''
INSERT INTO @ParsedList VALUES (CAST(@list1 AS VARCHAR(MAX)))
SET @list = SUBSTRING(@list, @pos+1, LEN(@list))
SET @pos = CHARINDEX(@delim, @list, 1)
END
SELECT
@rlist = COALESCE(@rlist+',','') + item
FROM (SELECT DISTINCT Item FROM @ParsedList) t
RETURN @rlist
END
GO
SELECT dbo.DistinctList('342,34,456,34,3454,456,aa,bb,cc,aa',',') DistinctList
GO

I encourage my readers to send their contribution as well so I can include their contribution as well.

Reference : Pinal Dave (http://blog.SQLAuthority.com)

About these ads

16 thoughts on “SQL SERVER – Remove Duplicate Entry from Comma Delimited String – UDF

  1. Is there a reason that
    SELECT @rlist = COALESCE(@rlist+‘,’,”) + item

    is not

    SELECT @rlist = COALESCE(@rlist+@Delim,”) + item

  2. What should have probably been done here is to use a CLR function, as CLR routines are much better at procedural code than T-SQL.

    This could have been done in a few lines of code, and performed much better as well.

  3. Hi,

    I m wroking in MNC as SQL developer i required dis function but i m not getting it so can u help me out to get dis UDF and also wat it actually does.

  4. Hi Guys,

    Try out this simple solution.

    CREATE FUNCTION DBO.RECU (@DATA NVARCHAR(1000))
    RETURNS @RETTABLE TABLE
    ( COL1 NVARCHAR(200))
    AS
    BEGIN
    IF LEN(@DATA) > 0 AND CHARINDEX (‘,’,@DATA) > 0
    BEGIN
    INSERT INTO @RETTABLE
    SELECT LEFT(@DATA,CHARINDEX (‘,’,@DATA)-1) AS COL1
    UNION ALL
    SELECT COL1 FROM DBO.RECU(RIGHT(@DATA,LEN(@DATA)-CHARINDEX (‘,’,@DATA)))
    END
    RETURN
    END

    SELECT DISTINCT COL1 FROM DBO.RECU(‘342,34,456,34,3454,456,aa,bb,cc,aa,’)

    Thanks and Regards
    Muralikrishnan

  5. Dave,
    I am impressed with your blog pages. There is lot of materail and it is really helpful for SQL pro’s.

    I have a requirement and wanted to know if you or anyone can help…

    Table1

    ID (Pkey) Column2 (Account numbers)

    1001 U987654321
    1002 123213121,3434334,881216665
    1003 A123456789,C123456789
    1004 U1818012, U808561773……

    Column 1 is a unique ID number which is the Pkey. Column2 holds (Accoount numbers) and can have upto 12 account numbers. My requirement is convert them in to table like ….

    1001 U987654321
    1002 123213121
    1002 3434334
    1002 881216665
    1003 A123456789
    1003 C123456789
    1004 U808561773
    1004 U808561773
    ……..

    Pls let me know how to accomplish this using SQL 2008.

    Thanks,
    Hari

  6. @Hari,

    How about using While loop or cursor, Because I dont know any other simple method to solve this.

    1. Write a while loop,
    2. Every time assign Column2 value to a variable.
    3. use another while loop and Use substring function to reduce the size of variable and each time you reduce the size you put value into a temporary table.
    4. At the end of while loop, do a select * from temp table.

    ~ IM.

    • Hi ,i am Aluri from bangalore and i am a dot net developer ,i hv a requirement that i couldn’t solve,hope you can solve my problem:
      well my requirement is to count duplicate words in a sentense which is stored in a column example:
      1. I am A software engineer.I am a dot net developer.
      that is one record,what i need is,to count homany times repeted values in that record ,and the result would be like this:
      I am (2)
      Software(2)
      like that,i realy tried many string functions but faild to get the right result,hope you can help me

      my mail id is: [email id removed]
      thank you

      Aluri

  7. I was looking for a way to do this and found your blog. The function worked great! So THANKS for that!

    While I was searching for how to do something totally different, however, I actually revisited the dedupe problem and came up with the following:

    declare @list nvarchar(max)
    set @list = ‘342,34,456,34,3454,456,aa,bb,cc,aa’
    declare @x xml
    declare @delim nvarchar(1)
    set @delim = ‘,’

    set @x = ”+replace(lower(@list),@delim,”)+”
    select replace(cast(@x.query(‘distinct-values(data(/list/delim))’) as varchar(max)),’ ‘,@delim) as result

    Basically I turn the delimited list in to XML and use ‘distinct-values’ to get rid of dupes.

    One gotcha to keep in mind is ‘distinct-values’ appears to be an exact match comparison and is why I included the ‘lower()’ command. So if @list = ”342,34,456,34,3454,456,aa,bb,cc,aa,AA’ AND you are not able to leave the ‘lower()’ command in there, you’d get both ‘aa’ and ‘AA’ in the result.

    Enjoy!

  8. Thanks for your post, how do you keep the same order in the comma delimited list? After using dbo.DistintList() it changes the order to SQL ordering, meaning numbers first then letters. I am trying to take individual rows of products and create the size/color array.

    XS,S,M,L,XL,2XL,3XL turns into an output of 2XL,3XL,L,M,S,XS.

    I need these to be in size order (S to 3XL). I have another column for SortPriority but this is not obeyed after running the function.

  9. Use this one to split string

    DECLARE @str varchar(100),@sep VARCHAR(10)
    select @str = ‘1,2,3,4’ ,@sep = ‘,’
    ;
    WITH Split(stpos,Endpos)
    AS( SELECT 0 stpos
    ,CHARINDEX(@sep,@str) Endpos
    UNION ALL
    SELECT Endpos+1
    ,CHARINDEX(@sep,@str,Endpos+1)
    FROM split
    WHERE Endpos > 0
    )
    SELECT SUBSTRING(@str,stpos,COALESCE(NULLIF(Endpos,0),LEN(@str)+1)-stpos)
    FROM split

  10. CREATE FUNCTION DBO.DISTINCT_MSG_SEQ
    (
    @LIST VARCHAR(MAX)
    )
    RETURNS VARCHAR(MAX)
    AS
    BEGIN
    DECLARE @DISTINCT_MSGLIST TABLE
    (
    S_NO INT IDENTITY(1,1),
    ITEM VARCHAR(MAX)
    )
    DECLARE @LIST1 VARCHAR(MAX),
    @POS INT,
    @RLIST VARCHAR(MAX),
    @MAX VARCHAR(MAX),
    @DELIM VARCHAR
    SET @DELIM = ‘,’
    SET @LIST = LTRIM(RTRIM(@LIST)) + @DELIM
    SET @POS = CHARINDEX(@DELIM, @LIST, 1)
    SET @LIST1 = LTRIM(RTRIM(LEFT(@LIST, @POS – 1)))

    INSERT INTO @DISTINCT_MSGLIST VALUES (CAST(@LIST1 AS VARCHAR(MAX)))
    SET @LIST = SUBSTRING(@LIST, @POS+1, LEN(@LIST))
    SET @POS = CHARINDEX(@DELIM, @LIST, 1)
    WHILE @POS > 0
    BEGIN
    SET @LIST1 = LTRIM(RTRIM(LEFT(@LIST, @POS – 1)))
    IF @LIST1 ”
    SELECT @MAX = ITEM FROM @DISTINCT_MSGLIST WHERE S_NO = ISNULL((SELECT MAX(S_NO) FROM @DISTINCT_MSGLIST) ,0)
    IF @MAX @LIST1
    INSERT INTO @DISTINCT_MSGLIST VALUES (CAST(@LIST1 AS VARCHAR(MAX)))
    SET @LIST = SUBSTRING(@LIST, @POS+1, LEN(@LIST))
    SET @POS = CHARINDEX(@DELIM, @LIST, 1)
    END
    SELECT @RLIST = COALESCE(@RLIST+’,’,”) + ITEM
    FROM (SELECT ITEM FROM @DISTINCT_MSGLIST) T

    RETURN @RLIST

    END

    The above function will display the unique msg sequence from the list

    eg : – If you pass the string ‘A,A,B,B,A,B’ , this function gives the output as ‘A,B,A,B’

  11. i have problem in solving the following case.can any one help regarding this one

    Input = ‘A:1, 1, 2, 4 B:3, 5, 5, 6 C:7, 7, 9, 4′

    Required Output = ‘A:1, 2, 4 B:3, 5, 6 C:4, 7, 9′

  12. Good idea and code .. Just one problem though: .. having dynamic field names in the query, there are cases like “ISNULL(, ”)” – which the funtion will split by “,” ..

  13. Pingback: SQL SERVER – Weekly Series – Memory Lane – #012 « SQL Server Journey with SQL Authority

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s