Some puzzles come as legacy and a great way to do sometimes interviews. In the past, I have written some innovative solutions when it comes to splitting strings inside SQL Server. One of them is at: SQL SERVER – Split Comma Separated List Without Using a Function. Though this is a good solution, many have some back to say this is not a clean or a native way to work with strings. Let us learn about Splitting Strings puzzle.
Let me complicate this statement a bit and set the stage for you. Assume you have a column that has a delimited value inside it. Our requirement is to split the values and find out the count of values. So I am making the split string conversation into an aggregation too. Nothing fancy about this though.
Splitting Strings Puzzle
Let us set up the environment for testing.
USE tempdb GO -- Create a table CREATE TABLE SplitWithCount ( cols VARCHAR(100) ) GO -- Insert Some Value INSERT INTO SplitWithCount (cols) VALUES ('Pinal;Dave;Kumar'),('Pinal;Dave'),('Pinal') GO -- Retrieve data SELECT * FROM SplitWithCount GO
What we want to achieve is to get an output as shown below:
The puzzle here is to see who can bring a simple code to achieve this above requirement.
Welcome 2016:
Interestingly, we have had a note around using STRING_SPLIT function inside SQL Server and it has been documented here: SQL SERVER 2016 – STRING_SPLIT Function Performance Comparison – Notes from the Field #118. If you are on SQL Server 2016 then we can easily do this using the following statement.
SELECT ss.value AS SplitMyString, COUNT(ss.value) [Counts] FROM SplitWithCount CROSS APPLY STRING_SPLIT(cols,';') ss GROUP BY ss.value GO
Though this is a very elegant way to do inside SQL Server 2016, I don’t think we can even beat the simplicity of this solution. Having said that if you are on an earlier version, would love to see what solutions you can come up with. Let me know. If you go through my earlier blog this would look like:
SELECT SplitMyString, COUNT(*) Counts FROM ( SELECT CAST(CONCAT('<Cols>' , REPLACE(cols, ';' , '</Cols><Cols>' ) , '</Cols>' ) AS XML) Xmlcol FROM SplitWithCount ) s CROSS apply ( SELECT ProjectData.D.value('.', 'VARCHAR(10)') as SplitMyString FROM s.xmlcol.nodes('Cols') as ProjectData(D) )a GROUP BY a.SplitMyString
Leave a comment if you are aware of alternate solutions. I will be happy to publish that on blog with due credit.
Reference: Pinal Dave (https://blog.sqlauthority.com)
47 Comments. Leave new
DECLARE @str VARCHAR(max)
SELECT @str = CONCAT (Coalesce(@str + ‘;’, ”) , cols)
FROM SplitWithCount
SELECT Items [cols]
, Count(1) [Count]
FROM dbo.Split(@str, ‘;’)
GROUP BY Items
—
/*
SELECT * FROM dbo.[Split](‘5,4,3,’,’,’)
*/
CREATE FUNCTION [dbo].[Split]
(
@RowData nvarchar(2000),
@SplitOn nvarchar(5)
)
RETURNS @RtnValue table
(
Id int identity(1,1),
Data nvarchar(100)
)
AS
BEGIN
Declare @Cnt int
Set @Cnt = 1
While (Charindex(@SplitOn,@RowData)>0)
Begin
Insert Into @RtnValue (data)
Select
Data = ltrim(rtrim(Substring(@RowData,1,Charindex(@SplitOn,@RowData)-1)))
Set @RowData = Substring(@RowData,Charindex(@SplitOn,@RowData)+1,len(@RowData))
Set @Cnt = @Cnt + 1
End
Insert Into @RtnValue (data)
Select Data = ltrim(rtrim(@RowData))
Return
END
GO
DECLARE @Table AS TABLE
(
RowIndex INT IDENTITY(1,1)
,Data VARCHAR(100)
)
INSERT INTO @Table
SELECT *
FROM SplitWithCount
DECLARE @RowCount INT
DECLARE @RowIndex INT
SET @RowIndex =1
SELECT @RowCount = COUNT(*)
FROM @Table
DECLARE @FinalTable AS TABLE
(
RowIndex INT IDENTITY(1,1),
SplitMyString VARCHAR(50)
)
WHILE (@RowIndex <=@RowCount)
BEGIN
DECLARE @TableString VARCHAR(100)
SELECT @TableString =Data
FROM @Table
WHERE RowIndex = @RowIndex
INSERT INTO @FinalTable
SELECT DATA FROM dbo.split(@TableString,';')
SET @RowIndex = @RowIndex + 1
END
SELECT SplitMyString , COUNT(SplitMyString) FROM @FinalTable
GROUP BY SplitMyString
Concat functions works from SQLSERVER 2012.So we have to use ‘+’ operator for the versions below 2012.
Yes
;WITH CTE ( SplitMyString, cols)
AS
(
SELECT LEFT(cols, CHARINDEX(‘;’, cols+’;’) – 1) ,SUBSTRING(cols, CHARINDEX(‘;’, cols+’;’) + 1, LEN(cols))
FROM SplitWithCount t
UNION ALL
SELECT LEFT(cols, CHARINDEX(‘;’, cols+’;’) – 1) ,SUBSTRING(cols, CHARINDEX(‘;’, cols+’;’) + 1, LEN(cols))
FROM CTE WHERE LEN (cols) > 0
)
select SplitMyString,COUNT(SplitMyString) AS Counts from CTE GROUP BY SplitMyString
Very Good Harsh :)
I think this is simplest!
SELECT SplitMyString,Count(*) Counts FROM ( SELECT
LTRIM(RTRIM(m.n.value(‘.[1]’,’varchar(8000)’))) AS SplitMyString
FROM
(SELECT CAST(” + REPLACE(cols,’;’,”) + ” AS XML) AS x FROM SplitWithCount)t CROSS APPLY x.nodes(‘/XMLRoot/R’)m(n) ) M GROUP BY SplitMyString
;WITH splitter_cte AS (
SELECT CHARINDEX(‘;’, cols) as pos, 0 as lastPos, cols, SUBSTRING(cols,0, CASE WHEN CHARINDEX(‘;’, cols) = 0 THEN LEN(cols) + 1 ELSE CHARINDEX(‘;’, cols) END) nm from SplitWithCount
union all
select CHARINDEX(‘;’, cols, pos + 1), pos, cols, SUBSTRING(cols,pos + 1, CASE WHEN CHARINDEX(‘;’, cols, pos + 1) = 0 THEN LEN(cols) – POS ELSE CHARINDEX(‘;’, cols, pos + 1) – POS – 1 END )
from splitter_cte
where pos > 0
)
SELECT nm, count(nm) cnt
FROM splitter_cte
group by nm
Declare @str Varchar(5000), @Pos int
select @str = STUFF((select cols +’;’ from Splitwithcount For xml path(”)),1,0,”)
create table #tmp (Col varchar(100))
SET @pos = CHARINDEX(‘;’, @str, 0)
WHILE (@pos != 0)
BEGIN
INSERT INTO #tmp
VALUES (SUBSTRING(@str, 0, @pos))
Select @str = SUBSTRING(@str, @pos+1, LEN(@str)-@pos)
select @pos = CHARINDEX(‘;’, @str, 0)
END
select Col,count(1)
from #tmp
group by Col
drop table #tmp
Select Col1, Count(1) From
(SELECT LTRIM(RTRIM(Data.nod.value(‘.[1]’,’varchar(8000)’))) as Col1
FROM
(
SELECT CAST(” + REPLACE(Cols,’;’,”) + ” AS XML) XmlSplit
FROM SplitWithCount
)A
CROSS APPLY XmlSplit.nodes(‘/Root/Data’)Data(nod)
) Final group by Col1
select cols, count(cols )from (
SELECT
LTRIM(RTRIM(m.n.value(‘.[1]’,’varchar(8000)’))) AS cols
FROM
(
SELECT CAST(‘‘ + REPLACE(cols,’;’,’‘) + ‘‘ AS XML) AS x
FROM SplitWithCount
)t
CROSS APPLY x.nodes(‘/a/b’)m(n) ) a
group by cols
By replacing “;” with xml tag you can achieve the result. Here is the script.
SELECT f.YourName, COUNT(f.MyName) AS YourCount FROM
(
SELECT t.c.value(‘.’, ‘nvarchar(100)’) as YourName FROM
(
SELECT CAST(” + REPLACE(cols, ‘;’,”) + ” as XML) AS ColXml FROM SplitWithCount
) k
CROSS APPLY ColXml.nodes(‘/p’) as t(c)
) f
GROUP BY f.MyName
Result
======
YourName YourCount
———————————
Dave 2
Kumar 1
Pinal 3
DECLARE @tbl TABLE (col VARCHAR(256))
INSERT INTO @tbl
VALUES(‘PINAL;DAVE;KUMAR’),(‘PINAL;DAVE’),(‘PINAL’)
SELECT * FROM @tbl
;WITH CTE
AS(
SELECT 1 AS NUMBER
UNION ALL
SELECT
NUMBER + 1
FROM
CTE
WHERE
NUMBER <= 100
)
SELECT * INTO #NUMBER FROM CTE OPTION(MAXRECURSION 100)
SELECT
SplitMyString,
COUNT(SplitMyString)
FROM
(
SELECT
SUBSTRING(col,NUMBER,EndPt – NUMBER) AS SplitMyString
FROM
(
SELECT
*,
CHARINDEX(';',col + ';', Number) AS EndPt
FROM
@tbl
CROSS JOIN
#NUMBER
WHERE
SUBSTRING(';'+col ,Number,1) = ';'
)X
)Y
GROUP BY
SplitMyString
DROP TABLE #NUMBER
DECLARE @str VARCHAR(max)
SELECT @str= CONCAT (Coalesce(@str + ‘;’, ”) , cols)
FROM SplitWithCount
DECLARE @TmpTable AS TABLE (Value VARCHAR(MAX))
;WITH Split(stpos,endpos) AS
(
SELECT 0 AS stpos, CHARINDEX(‘;’,@str) AS endpos
UNION ALL
SELECT CAST(endpos AS INT)+1, CHARINDEX(‘;’,@str,endpos+1)
FROM Split
WHERE endpos > 0
)
SELECT SUBSTRING(@str,stpos,COALESCE(NULLIF(endpos,0),LEN(@str)+1)-stpos) AS SplitMyString ,COUNT(1) AS Counts
FROM Split
GROUP BY SUBSTRING(@str,stpos,COALESCE(NULLIF(endpos,0),LEN(@str)+1)-stpos)
— declaration
DECLARE @content VARCHAR(8000)
DECLARE @delimiter CHAR(1)
— Intialialization
SELECT @delimiter = ‘;’
— Turn the rows into one string
SELECT @content = LEFT(cols, LEN(cols) – 1)
FROM (
SELECT cols + @delimiter
FROM SplitWithCount
FOR XML PATH (”)
) c (cols)
— recursive work
;WITH Split_CTE(StartPosition, EndPosition) AS
(
SELECT 0 AS StartPosition, CHARINDEX(@delimiter, @content) AS EndPosition
UNION ALL
SELECT EndPosition + 1, CHARINDEX(@delimiter, @content, EndPosition + 1)
FROM Split_CTE
WHERE EndPosition > 0
)
SELECT SUBSTRING(@content, StartPosition,COALESCE(NULLIF(EndPosition, 0), LEN(@content) + 1) – StartPosition) as ‘SplitString’ , COUNT(*) as ‘Counts’
FROM Split_CTE
GROUP BY SUBSTRING(@content, StartPosition,COALESCE(NULLIF(EndPosition, 0), LEN(@content) + 1) – StartPosition)
i think outside the box
declare @sql varchar(max) = ”
select @sql = @sql + ‘ as nm union all select ”’ + replace(cols, ‘;’, ”’ as nm union all select ”’) + ”” from splitwithcount
set @sql = ‘select nm, count(*) cnt from (‘ + SUBSTRING(@sql, 18, len(@sql)) + ‘) a group by nm’
exec(@sql)
Quick and dirty but different :-) – works for up to 4 concated strings (if they are not to long). Instead of UNION ALL I could have used UNPIVOT too:
good try Franz ,
this one fast solution but only for splitting 5 values not more then it. fastest but some how tough to understand solution is using XML that has given by Pinal Sir.
Thanks Namrata.
BTW: you can use [ code ] and [ /code ] (without spaces) to post formated code
Cool Trick Thomas! I have changed that to [sql ] and [/sql ] (without spaces) and now it looks better with color code.
I like this puzzle :)
This could be another answer but dynamic one.
SELECT a.cols, COUNT(a.cols) AS CT
FROM
(
select parsename(replace(cols,’;’,’.’),1) as cols
from SplitWithCount
UNION ALL
select parsename(replace(cols,’;’,’.’),2) as cols
from SplitWithCount
UNION ALL
select parsename(replace(cols,’;’,’.’),3) as cols
from SplitWithCount
) a
WHERE NOT a.cols IS NULL
GROUP BY a.cols
Let me try this.
—————————————
declare @t table (cl varchar(100))
declare @i int=0
insert @t(cl) values (‘Pinal;Dave;Kumar’),(‘Pinal;Dave’),(‘Pinal’)
;
with lista(p1,p2) as(
select
substring(cl,1,case when charindex(‘;’,cl)=0 then len(cl) else charindex(‘;’,cl)-1 end)
,case
when charindex(‘;’,cl)=0 then null
else substring(cl, charindex(‘;’,cl)+1, len(cl)) end
from @t
union all
select
substring(p2,1,case when charindex(‘;’,p2)=0 then len(p2) else charindex(‘;’,p2)-1 end)
,case
when charindex(‘;’,p2)=0 then null
else substring(p2, charindex(‘;’,p2)+1, len(p2)) end
from lista
where p2 is not null)
select
p1 StringPart
, count(*) COUNTS
from lista
group by p1
; With Data
As
(
SELECT Cols
, Cast(1 As Int) [Level]
, Cast(1 As Int) [Start]
, Cast(Case When CharIndex(‘;’,Cols) > 0 Then CharIndex(‘;’,Cols)-1 Else Len(Cols) End As Int) [End]
, Cast(Case When CharIndex(‘;’,D.Cols) > 0 Then Stuff(D.Cols, 1, CharIndex(‘;’,D.Cols), ”) End As Varchar(100)) [NextStr]
FROM SplitWithCount D
Union All
Select D.Cols
, Cast(D.Level + 1 As Int) [Level]
, Cast(D.Start+ D.[End]+1 As Int) [Start]
, Cast(Case When CharIndex(‘;’,D.NextStr) > 0 Then CharIndex(‘;’,D.NextStr)-1 Else Len(D.NextStr) End As Int) [End]
, Cast(Case When CharIndex(‘;’,D.NextStr) > 0 Then Stuff(D.NextStr, 1, CharIndex(‘;’,D.NextStr), ”) End As Varchar(100)) [NextStr]
From Data D
Where D.NextStr ”
)
Select SubString(D.Cols, D.Start, D.[End]) [SplitMyString], Count(1) [Counts]
From Data D
Group By SubString(D.Cols, D.Start, D.[End])