SQL SERVER – Puzzle – DISTINCT Values in NVARCHAR

July 17, 2015

SQL, SQL Puzzle, SQL Server, SQL Tips and Tricks

The technique used in this blog is explained in earlier blog SQL SERVER – Creating Dataset Using VALUES Clause Without Creating Table. To read more about Collation and Unicode refer to MSDN. This topic in general is very deep and complicated sometimes that you get to learn something every single time.

While playing with SSIS package, I was getting unique constraint error while data were getting inserted in SQL Server. Later I did further digging and found that it was due to German characters. Here is a simplified version of the problem. Will you be able to crack the puzzle?

You can use the SQL Server Management studio and run below three queries:

SET NOCOUNT ON SELECT DISTINCT columny COLLATE SQL_Latin1_General_CP1_CI_AS AS 'First' FROM (VALUES ('ß'),('SS') ) tablex(columny) ; SELECT DISTINCT columny COLLATE Latin1_General_CI_AS AS 'Second' FROM (VALUES ('ß'),('SS') ) tablex(columny) ; SELECT DISTINCT CAST(columny AS NVARCHAR) COLLATE SQL_Latin1_General_CP1_CI_AS AS 'Third' FROM (VALUES (N'ß'),(N'SS') ) tablex(columny) ;

If you are not having access to SQL Server handy – here is a sample output for them:

SQL SERVER - Puzzle - DISTINCT Values in NVARCHAR ss-puzzle-01

Do you know the reason why ‘ß’ and ‘SS’ are same in nvarchar? Please comment and let me know. I will share surprise learning voucher with you.

Hope we will learn something together.

Reference: Pinal Dave (https://blog.sqlauthority.com)

SQL SERVER – Enhancing Reference Data in Master Data Services – Notes from the Field #089

PowerShell: How to Find Logical and Physical Processors on SQL Server?

No results found.

15 Comments. Leave new

Benjamin Steinfeld
July 17, 2015 8:45 am
ß is a Scharfe S, it’s interchangeable with SS according to Unicode 7.0 Case Folding Rules
Reply
manish
July 17, 2015 9:20 am
character expansion is used for some independent letters on certain collations like
ß can be expended to SS
œ can be expended to OE
æ can be expended to AE
thats why we r getting our result like this
Reply
krmanish1986
July 17, 2015 9:21 am
can be expended to
Reply
krmanish1986
July 17, 2015 9:25 am
there are some independent letters that can be expended while matching using character expansion.. For ex
ß can be expended to SS
æ can be expanded to AE
thats y we r getting our result like this
Reply
- krmanish1986
  July 17, 2015 9:25 am
  On certain collations
  Reply
nixonknr
July 17, 2015 2:04 pm
Cool Post!
In Non-Unicode “Latin1_General_CI_AS” expansion of ‘ß’ to ‘SS’ take place for Unicode and Non-Unicode strings.
In Unicode “SQL_Latin1_General_CP1_CI_AS” expansion of ‘ß’ to ‘SS’ take place only for Unicode strings (N’ß’ and N’SS’).
For Non-Unicode strings direct comparison of string literals take place (without expansion).
So for this one it is matter, whether you’re comparing N’ß’ and N’SS’ or ‘ß’ and ‘SS’.
Cheers.
Reply
Sanjay Monpara
July 17, 2015 3:43 pm
Similar issue:
DECLARE @A VARCHAR(23)=’KW’
DECLARE @B VARCHAR(23)=’KV’
IF @A COLLATE Danish_Norwegian_CI_AI =@B COLLATE Danish_Norwegian_CI_AI
SELECT 1 AS Danish
ELSE
SELECT 0 AS Danish
IF @A COLLATE Finnish_Swedish_100_CI_AI =@B COLLATE Finnish_Swedish_100_CI_AI
SELECT 1 AS Finnish
ELSE
SELECT 0 AS FInnish
–>–Ourput
Danish
——–
0
Finnish
———
1
Reply
Manish
July 18, 2015 7:19 pm
select (null+2+3+5) =null
select sum(salary) FROM TBL_EMPLOYEE =90000
WHERE SALARY IS 30000,30000,30000 , null FOR 4 EMPLOYEES
wHY ?
Reply
- madhivanan
  July 21, 2015 12:59 pm
  It is because SUM function ignore NULL values by default
  Reply
  - Manish
    July 22, 2015 10:19 pm
    select SUM(null+2+3+5) =null
    but sir this also results to NULL..
    is this the wrong way of writing query
    or something else is happening here
  - madhivanan
    July 23, 2015 12:30 pm
    You have used + operator directly on NULL that’s why it is NULL whereas SUM function takes values from each row and ignores NULL
  - madhivanan
    July 23, 2015 12:36 pm
    Also see what this returns
    select sum(col) as col from
    (
    select null as col
    union
    select 2
    union
    select 3
    union
    select 5
    ) as t
  - Manish
    July 25, 2015 9:00 pm
    Got it sir
    Thanx
Amar Raj
June 18, 2016 4:31 pm
Hi Due to this it will happen:-
You should see little difference if the collation is SQL_Latin1_General_CP1_CI_AS or Latin1_General_CI_AS, but both have instances where they are faster or slower than the other.
Latin1_General_CI_AS :- Latin1-General, case-insensitive, accent- sensitive, kanatype-insensitive, width-insensitive
SQL_Latin1_General_CP1_CI_AS:- Latin1-General, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive for Unicode Data, SQL Server Sort Order 52 on Code Page 1252 for non-Unicode Data
Reply
Amar Raj
June 18, 2016 4:54 pm
SELECT * FROM ::fn_helpcollations()
WHERE name IN (
‘SQL_Latin1_General_CP1_CI_AS’,
‘Latin1_General_CI_AS’
)
Reply

June 2025 Discount: Comprehensive Database Performance Health Check | Testimonials

SQL SERVER – Puzzle – DISTINCT Values in NVARCHAR

Related Posts

15 Comments. Leave new

Leave a ReplyCancel reply