I use the following script very often, and I realized that I have never shared it on this blog before. Creating Comma Separated Values (CSV) from a table column is a very common task, and we all do it many times a day. Let us see the example that I use frequently and its output.
USE AdventureWorks
GO
-- Check Table Column
SELECT Name
FROM HumanResources.Shift
GO
-- Get CSV values
SELECT SUBSTRING(
(SELECT ',' + s.Name
FROM HumanResources.Shift s
ORDER BY s.Name
FOR XML PATH('')), 2, 200000) AS CSV
GO
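On the AdventureWorks sample database, HumanResources.Shift contains the Day, Evening, and Night shifts, so the CSV query above returns a single value along these lines:
CSV
-----------------
Day,Evening,Night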
I consider XML the best solution in terms of code and performance. Further, as I totally prefer this option, I am not even including the links to my other articles, where I have described other options.
Do you use any other method to resolve this issue? Can you find any significant difference in performance between these options? Please leave your comment here.
UPDATE: SQL Server 2016 has a new JSON feature, which removes the need to use XML for building CSV. Here are a few related articles about SQL Server 2016 and JSON.
- SQL SERVER – 2016 – Opening JSON with OPENJSON()
- SQL SERVER – 2016 – Check Value as JSON With ISJSON()
- SQL SERVER – JSON Support for Transmitting Data for SQL Server 2016 – Notes from the Field #114
- SQL SERVER – Getting started and creating JSON using SQL Server 2016
Reference: Pinal Dave (https://blog.sqlauthority.com)
126 Comments
Could you explain (or point me to a good resource) “FOR XML PATH”?
I was actually working on a similar query today and wanted to order the select for my CSV query, but got an error indicating you can’t use an ORDER BY clause in a derived table and suggesting the use of FOR XML PATH.
Thanks!
Ryan
You can use the ORDER BY clause. However, you need to use TOP 100 PERCENT in the SELECT clause.
I have always wondered why they don’t put a Concatenate function into SQL Server.
Although the CTE approach might be the most “SQL” method of concatenation, I personally prefer the CLR approach. Here is a pretty good example of this:
http://msdn.microsoft.com/en-us/library/ms254508%28VS.80%29.aspx
I also believe that this is a good article discussing the many many ways that people try to concatenate records:
http://www.simple-talk.com/sql/t-sql-programming/concatenating-row-values-in-transact-sql/
The result is perfect, but I find it a bit complicated. What does the 2,200000 have to do with this? Is it the RANGE?
Thanks,
Nitin Sharma
2 is used to omit the leading comma; 200000 is just a number that should exceed the maximum length of the resulting string. You can use any sufficiently large number in place of 200000.
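As a quick illustration of those two arguments, here is a minimal sketch that uses a literal string in place of the FOR XML PATH subquery:
-- Start at position 2 to skip the leading comma, and take up to 200000 characters
-- (far more than the string actually holds); this returns 'Day,Evening,Night'
SELECT SUBSTRING(',Day,Evening,Night', 2, 200000) AS CSV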
DECLARE @a AS VARCHAR(4000)
SET @a = ''
SELECT @a = @a + NAME + ','
FROM HumanResources.Shift
SELECT @a
Dinesh, you will end up with an extra comma.
You can try this:
DECLARE @a AS VARCHAR(4000) = ''
SELECT @a = @a + CASE WHEN @a = '' THEN '' ELSE ', ' END + NAME
FROM HumanResources.Shift
I've used a managed function with a StringBuilder; it results in better performance.
Hi Nitin,
The 2,200000 is part of the SUBSTRING function's arguments; it says grab up to 200000 characters starting from the 2nd position.
Oh I see, I guessed it right.
Thanks, @ramdas.
We can use the STUFF function instead of SUBSTRING, just to keep the code clean.
It also helps if we want to handle data larger than 200000 characters:
SELECT STUFF((query using FOR XML PATH), 1, 1, '') AS CSV
Yes. This is the more reliable method, and there is no need to worry about the actual length of the output.
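Applied to the original script, the STUFF variant would look something like the following sketch (again against the AdventureWorks Shift table):
-- STUFF removes the first character (the leading comma) without a hard-coded length
SELECT STUFF(
(SELECT ',' + s.Name
FROM HumanResources.Shift s
ORDER BY s.Name
FOR XML PATH('')), 1, 1, '') AS CSV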
The 2, 200000 is ugly, as previously pointed out.
Code that imposes hard-coded restrictions like this should not be suggested as a best practice or used as a teaching example.
DECLARE @Csv varchar(Max)
SELECT @Csv = COALESCE(@Csv + ', ', '') +
CAST(s.Name AS varchar(50))
FROM HumanResources.Shift s
ORDER BY s.Name
SELECT @Csv
This will concatenate all values regardless of group.
If you want to reset the string for each group, you need to use the FOR XML approach with a GROUP BY clause, as sketched below.
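A minimal sketch of that grouped version, using hypothetical names (YourTable with a GroupCol and a character column ValueCol), would be:
-- One CSV per GroupCol value; the inner query is correlated on the group
SELECT t.GroupCol,
STUFF((SELECT ',' + t2.ValueCol
FROM YourTable t2
WHERE t2.GroupCol = t.GroupCol
FOR XML PATH('')), 1, 1, '') AS CSV
FROM YourTable t
GROUP BY t.GroupCol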
I like your solution, but I find that it is in many ways a one-off solution and unfortunately will not work if you have multiple columns. Let's say you had two tables:
Table A
->a_id
Table B
->b_id
->name
->a_id
And you wanted the following results
[a_id],[name,name,name,name]
If you want to handle multiple columns, you need to create a User Defined Aggregate (see below), so that all you need to do is write a simple query like the one below:
SELECT A.a_id, dbo.toCSV(B.name)
FROM A
INNER JOIN B ON B.a_id = A.a_id
GROUP BY A.a_id
[Serializable]
[SqlUserDefinedAggregate(
Format.UserDefined, //use clr serialization to serialize the intermediate result
IsInvariantToNulls = true, //optimizer property
IsInvariantToDuplicates = false, //optimizer property
IsInvariantToOrder = false, //optimizer property
MaxByteSize = 8000) //maximum size in bytes of persisted value
]
public class toCSV : IBinarySerialize
{
/// <summary>
/// The variable that holds the intermediate result of the concatenation.
/// </summary>
private StringBuilder intermediateResult;
/// <summary>
/// Initialize the internal data structures.
/// </summary>
public void Init()
{
this.intermediateResult = new StringBuilder();
}
/// <summary>
/// Accumulate the next value; the value is skipped if it is null.
/// </summary>
/// <param name="value">The value to accumulate.</param>
public void Accumulate(SqlString value)
{
if (value.IsNull)
{
return;
}
this.intermediateResult.Append(value.Value).Append(',');
}
/// <summary>
/// Merge the partially computed aggregate with this aggregate.
/// </summary>
/// <param name="other">The other partial aggregate to merge into this one.</param>
public void Merge(toCSV other)
{
this.intermediateResult.Append(other.intermediateResult);
}
/// <summary>
/// Called at the end of aggregation, to return the result of the aggregation.
/// </summary>
/// <returns>The concatenated, comma-separated string.</returns>
public SqlString Terminate()
{
string output = string.Empty;
//delete the trailing comma, if any
if (this.intermediateResult != null
&& this.intermediateResult.Length > 0)
{
output = this.intermediateResult.ToString(0, this.intermediateResult.Length - 1);
}
return new SqlString(output);
}
public void Read(BinaryReader r)
{
intermediateResult = new StringBuilder(r.ReadString());
}
public void Write(BinaryWriter w)
{
w.Write(this.intermediateResult.ToString());
}
}
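Once the class is compiled into an assembly, registering the aggregate is a separate T-SQL step. A rough sketch follows; the assembly name CsvAggregates and the file path are only placeholders, and CLR integration must already be enabled via sp_configure 'clr enabled'.
-- Deploy the compiled CLR assembly and expose the aggregate to T-SQL
CREATE ASSEMBLY CsvAggregates FROM 'C:\Assemblies\CsvAggregates.dll';
GO
CREATE AGGREGATE dbo.toCSV (@value NVARCHAR(4000))
RETURNS NVARCHAR(4000)
EXTERNAL NAME CsvAggregates.toCSV;
GO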
Hi Bret,
I was looking for the exact solution. Your toCSV function helped me achieve a solution that I was looking for.
Thanks,
Jitendra
Hi,
I have a table 'company' and it has a column 'keywords' where the values are stored like
KEYWORDS
Row1 ac, bg, hkl, jh, od
Row2 gh,jd,kl,ks
Now what I need is to create a new table named 'keywords' that stores the data like
KEYWORDS
Row1 ac
Row2 bg
Row3 hkl
Row4 jh
…………………………
……………………..
Is it possible to do that?
Please help.
Thanks and regards.
Hi,
The same splitting can be performed using the function below as well:
CREATE FUNCTION [dbo].[Split](@List VARCHAR(6000), @SplitOn VARCHAR(5))
RETURNS @RtnValue TABLE([Value] VARCHAR(100))
AS
BEGIN
-- Assumes a single-character delimiter; master..spt_values is used as a numbers
-- table, so lists longer than roughly 2000 characters will not be fully split
WITH s(start) AS
(
SELECT DISTINCT CHARINDEX(@SplitOn, @SplitOn + @List + @SplitOn, number)
FROM master..spt_values
WHERE type = 'P' AND number <= LEN(@List) + 1
),
chunks(chunk) AS
(
SELECT SUBSTRING(@SplitOn + @List + @SplitOn, start + 1,
CHARINDEX(@SplitOn, @SplitOn + @List + @SplitOn, start + 1) - start - 1)
FROM s
WHERE start < LEN(@List) + 2
)
INSERT INTO @RtnValue (Value)
SELECT chunk FROM chunks
RETURN
END
-- Use the Split function with your table as below (YourTable and YourColumn are placeholders):
SELECT fSplit.Value AS Result
FROM YourTable CROSS APPLY dbo.Split(YourTable.YourColumn, ',') AS fSplit
Kind Regards,
Pinal Dave
Dude, this code is incorrect.
The above would require correction before it could execute properly.
Hi,
I have a table like,
ID | Value
—- ——–
1000 1
1000 2
1000 3
1001 2
1001 3
1001 4
1001 5
I want the output like this
1000 | 1,2,3
1001 | 2,3,4,5
But when I use the below query,
SELECT ID, STUFF((SELECT ', ' + CAST(Value AS VARCHAR) FROM tblA GROUP BY Value FOR XML PATH('')), 1, 2, '') FROM tblA
I am getting the output as shown below
1000 | 1,2,3,4,5
1001 | 1,2,3,4,5
which is wrong. Please help.
Regards
Hi,
Can I get the same result by using functions?
Hi Taher,
I found that you applied GROUP BY in the wrong place.
To achieve the expected result, you need to change the query to:
DECLARE @tblA TABLE(ID INT, Value INT)
INSERT INTO @tblA VALUES(1000,1)
INSERT INTO @tblA VALUES(1000,2)
INSERT INTO @tblA VALUES(1000,3)
INSERT INTO @tblA VALUES(1001,2)
INSERT INTO @tblA VALUES(1001,3)
INSERT INTO @tblA VALUES(1001,4)
INSERT INTO @tblA VALUES(1001,5)
--SELECT * FROM @tblA
SELECT ID,
SUBSTRING(
(
select ' ,' + CAST(value AS VARCHAR)
from @tblA b
WHERE a.ID = b.ID
FOR XML PATH('')
)
,3,100)
FROM @tblA a
GROUP BY a.ID
Let me know if it helps you.
Thanks,
Tejas
SQLYoga.com
I am getting an error like 'incorrect syntax near FOR XML PATH'.
FOR XML is supported from version 2005 onwards.
For versions prior to 2005, you should use a function that concatenates the data, as sketched below.
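A rough sketch of such a scalar function, built on the variable-concatenation trick shown earlier in these comments and written against the tblA example (the function name is only illustrative):
-- Builds the CSV for one ID at a time; call it once per group
CREATE FUNCTION dbo.GetValueCsv (@ID INT)
RETURNS VARCHAR(8000)
AS
BEGIN
DECLARE @Csv VARCHAR(8000)
SELECT @Csv = COALESCE(@Csv + ',', '') + CAST(Value AS VARCHAR(10))
FROM tblA
WHERE ID = @ID
RETURN @Csv
END
-- Usage: SELECT ID, dbo.GetValueCsv(ID) FROM tblA GROUP BY ID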
Hi,
Thanks it works fine.
Thanks ,
vinutha
Hello Tejas,
Thank you very much for your reply. It solved my problem. I was doing the GROUP BY inside because I wanted to club all similar records (the main intention being not to get duplicate records). But now I have figured out that the same thing can be achieved by using DISTINCT.
Thanks A lot !!!
Taher
Consider that I have an employee table. I want my result to be displayed like the following:
emp name reporting order
steve \steve
austin \steve\austin
andrew \steve\austin\andrew
bruce \steve\austin\andrew\bruce
thank you in advance.
Akhiesh
Try using a recursive CTE, for example along the lines sketched below.
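A minimal sketch, assuming an Employee table with EmpID, EmpName, and ManagerID columns (ManagerID is NULL for the top of the reporting chain); these column names are assumptions, not part of the original question:
-- Walk down the management chain, appending each name to the parent's path
;WITH ReportingOrder AS
(
SELECT EmpID, EmpName,
CAST('\' + EmpName AS VARCHAR(4000)) AS ReportingPath
FROM Employee
WHERE ManagerID IS NULL
UNION ALL
SELECT e.EmpID, e.EmpName,
CAST(r.ReportingPath + '\' + e.EmpName AS VARCHAR(4000))
FROM Employee e
INNER JOIN ReportingOrder r ON e.ManagerID = r.EmpID
)
SELECT EmpName, ReportingPath AS [reporting order]
FROM ReportingOrder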
I got a little giddy when I tried this query on one of the databases I work on. I had used SQL functions to do this in the past… great post! Thanks for sharing.
How would you remove duplicate comma-separated values from the below query (sai,sam,sai)?
USE tempdb;
GO
--drop table t1
CREATE TABLE t1 (id INT, NAME VARCHAR(MAX));
INSERT t1 values (1,'Jamie');
INSERT t1 values (1,'Joe');
INSERT t1 values (1,'John');
INSERT t1 values (2,'Sai');
INSERT t1 values (2,'Sam');
INSERT t1 values (2,'Sai');
GO
select
id,
stuff((
select ',' + t.[name]
from t1 t
where t.id = t1.id
order by t.[name]
for xml path('')
),1,1,'') as name_csv
from t1
group by id
;
It should be
select
t.id,
stuff((
select distinct ',' + t1.[name]
from t1
where t.id = t1.id
for xml path('')
),1,1,'') as name_csv
from t1 as t
group by t.id
;
Hi,
Your work is good. Will it also work for MySQL?
Please suggest something similar for MySQL.
MySQL has a built-in function to do this
SELECT col, GROUP_CONCAT(some_col SEPARATOR ',') FROM your_table
GROUP BY col
Hi Pinal,
I've used the CTE method. Can you please tell me about the performance of this approach?
create table t1
(
val varchar(100)
)
insert into t1 values ('A')
insert into t1 values ('B')
insert into t1 values ('C')
insert into t1 values ('D')
insert into t1 values ('E')
with CTE (id,val)
AS
(
select id,val from (select Row_number() over (order by val) as id, val from t1) e where id = 1
union all
select e1.id, CAST(CTE.val + ',' + e1.val AS varchar(100)) AS val from CTE
inner join (select Row_number() over (order by val) as id, val from t1 ) e1 on e1.id = CTE.id + 1
)
select top 1 val from CTE order by id desc