SQL SERVER – 2005 – UDF – User Defined Function to Strip HTML – Parse HTML – No Regular Expression

One of the developers at my company asked is it possible to parse HTML and retrieve only TEXT from it without using regular expression. He wanted to remove everything between < and > and keep only Text. I found the question very interesting and quickly wrote UDF which does not use regular expression. Let us see how to parse HTML without regular expression.

Following UDF takes input as HTML and returns TEXT only. If there is any single quotes in HTML they should be replaced with two single quotes (not double quote) before it is passed as input to function.

CREATE FUNCTION [dbo].[udf_StripHTML] (@HTMLText VARCHAR(MAX))
RETURNS VARCHAR(MAX)
AS
BEGIN
DECLARE @Start INT
DECLARE @End INT
DECLARE @Length INT
SET @Start = CHARINDEX('<',@HTMLText) SET @End = 
CHARINDEX('>',@HTMLText,CHARINDEX('<',@HTMLText)) 
SET @Length = (@End - @Start) + 1 WHILE @Start > 0
AND @End > 0
AND @Length > 0
BEGIN
SET @HTMLText = STUFF(@HTMLText,@Start,@Length,'')
SET @Start = CHARINDEX('<',@HTMLText) SET @End = CHARINDEX('>',@HTMLText,CHARINDEX('<',@HTMLText))
SET @Length = (@End - @Start) + 1
END
RETURN LTRIM(RTRIM(@HTMLText))
END
GO

Test above function like this :

SELECT dbo.udf_StripHTML('<b>UDF at SQLAuthority.com </b>

<a href="http://www.SQLAuthority.com">SQLAuthority.com</a>')

Result Set:

UDF at SQLAuthority.com SQLAuthority.com

If you want to see this example in action click on Image. It will open large image.

SQL SERVER - 2005 - UDF - User Defined Function to Strip HTML - Parse HTML - No Regular Expression HTMLUDF_S

Let me know what think of blog post by leaving your note in the comment sections.

Reference: Pinal Dave (https://blog.sqlauthority.com)

SQL Function, SQL Scripts, SQL Server, SQL String
Previous Post
SQL SERVER – sp_HelpText for sp_HelpText – Puzzle
Next Post
SQL SERVER – De-fragmentation of Database at Operating System to Improve Performance

Related Posts

95 Comments. Leave new

  • Great code! It works perfectly, thanks for the time save!

    Reply
  • GREAT USER DEFINED FUNCTION

    I can use this function in where clause for example

    select * from table1 where dbo.udf_StripHTML(Content1) like ‘%Hello%’

    Reply
  • thanks for sharing the function,
    this is a very handy tool indeed :)

    Reply
  • Hi Pinal and everyone,

    This is precisely what I was looking for. But some time when I use this function, query times out. Can you suggest me as I am at great problem figuring it out.

    Reply
  • Karthikeyan Ponnusamy
    April 25, 2011 2:30 pm

    Hi,
    I have a field where i may have the HTML tags with the actual data. I want to search through this field, ignoring the HTML tags and only the displayable data.
    Eg. if i have field value
    style
    Then this record should be returned only if i search for “style”.
    For font, top, pad it should’nt be returned.
    I think regular expressions will be useful. I am using MySql 5. Would be helpful if you provide the whole syntax
    Thanks in advance
    karthik

    Reply
  • hello this code is complete

    remove tags

    ALTER FUNCTION [dbo].[udf_StripHTML]
    (@HTMLText VARCHAR(MAX))
    RETURNS VARCHAR(MAX)
    AS
    BEGIN
    DECLARE @Start INT
    DECLARE @End INT
    DECLARE @Length INT

    /* tag */
    SET @Start = CHARINDEX(”,@HTMLText)
    SET @End = CHARINDEX(”,@HTMLText)+5
    SET @Length = (@End – @Start) + 1
    WHILE @Start > 0
    AND @End > 0
    AND @Length > 0
    BEGIN
    SET @HTMLText = STUFF(@HTMLText,@Start,@Length,”)
    SET @Start = CHARINDEX(”,@HTMLText)
    SET @End = CHARINDEX(”,@HTMLText)+5
    SET @Length = (@End – @Start) + 1
    END

    /* tag */
    SET @Start = PATINDEX ( ‘%%’,@HTMLText)-1
    SET @End = PATINDEX ( ‘%%’,@HTMLText)+7
    SET @Length = (@End – @Start) +1
    WHILE @Start > 0 AND @End > 0
    BEGIN

    if @Length > 0
    BEGIN
    /* tag …. */
    SET @HTMLText = STUFF(@HTMLText,@Start,@Length,”)
    END
    ELSE
    BEGIN
    /*only tag */
    SET @HTMLText = STUFF(@HTMLText,@End-8,9,”)
    END
    SET @Start = PATINDEX ( ‘%%’,@HTMLText)-1
    SET @End = PATINDEX ( ‘%%’,@HTMLText)+7
    SET @Length = (@End – @Start) +1
    END

    /*tag general */
    SET @Start = CHARINDEX(”,@HTMLText,CHARINDEX(‘ 0
    AND @End > 0
    AND @Length > 0
    BEGIN
    SET @HTMLText = STUFF(@HTMLText,@Start,@Length,”)
    SET @Start = CHARINDEX(”,@HTMLText,CHARINDEX(‘ 0)
    BEGIN
    SELECT @vcEncoded = SUBSTRING(@vcResult, @siPos, 6)
    SELECT @siChar = CAST(SUBSTRING(@vcEncoded, 3, 3) AS SMALLINT)
    SELECT @vcResult = REPLACE(@vcResult, @vcEncoded, NCHAR(@siChar))
    SELECT @siPos = PATINDEX(‘%&#___;%’, @vcResult)
    END

    SELECT @siPos = PATINDEX(‘%&#____;%’, @vcResult)
    WHILE (@siPos > 0)
    BEGIN
    SELECT @vcEncoded = SUBSTRING(@vcResult, @siPos, 7)
    SELECT @siChar = CAST(SUBSTRING(@vcEncoded, 3, 4) AS SMALLINT)
    SELECT @vcResult = REPLACE(@vcResult, @vcEncoded, NCHAR(@siChar))
    SELECT @siPos = PATINDEX(‘%&#____;%’, @vcResult)
    END

    SELECT @vcResult = REPLACE(@vcResult, ‘"’, NCHAR(0x0022))
    SELECT @vcResult = REPLACE(@vcResult, ‘&’, NCHAR(0x0026))
    SELECT @vcResult = REPLACE(@vcResult, ‘<‘, NCHAR(0x003c))
    SELECT @vcResult = REPLACE(@vcResult, ‘>’, NCHAR(0x003e))
    SELECT @vcResult = REPLACE(@vcResult, ‘ ‘, NCHAR(0x00a0))
    SELECT @vcResult = REPLACE(@vcResult, ‘¡’, NCHAR(0x00a1))
    SELECT @vcResult = REPLACE(@vcResult, ‘¢’, NCHAR(0x00a2))
    SELECT @vcResult = REPLACE(@vcResult, ‘£’, NCHAR(0x00a3))
    SELECT @vcResult = REPLACE(@vcResult, ‘¤’, NCHAR(0x00a4))
    SELECT @vcResult = REPLACE(@vcResult, ‘¥’, NCHAR(0x00a5))
    SELECT @vcResult = REPLACE(@vcResult, ‘¦’, NCHAR(0x00a6))
    SELECT @vcResult = REPLACE(@vcResult, ‘§’, NCHAR(0x00a7))
    SELECT @vcResult = REPLACE(@vcResult, ‘¨’, NCHAR(0x00a8))
    SELECT @vcResult = REPLACE(@vcResult, ‘©’, NCHAR(0x00a9))
    SELECT @vcResult = REPLACE(@vcResult, ‘ª’, NCHAR(0x00aa))
    SELECT @vcResult = REPLACE(@vcResult, ‘«’, NCHAR(0x00ab))
    SELECT @vcResult = REPLACE(@vcResult, ‘¬’, NCHAR(0x00ac))
    SELECT @vcResult = REPLACE(@vcResult, ‘­’, NCHAR(0x00ad))
    SELECT @vcResult = REPLACE(@vcResult, ‘®’, NCHAR(0x00ae))
    SELECT @vcResult = REPLACE(@vcResult, ‘¯’, NCHAR(0x00af))
    SELECT @vcResult = REPLACE(@vcResult, ‘°’, NCHAR(0x00b0))
    SELECT @vcResult = REPLACE(@vcResult, ‘±’, NCHAR(0x00b1))
    SELECT @vcResult = REPLACE(@vcResult, ‘&sup2;’, NCHAR(0x00b2))
    SELECT @vcResult = REPLACE(@vcResult, ‘&sup3;’, NCHAR(0x00b3))
    SELECT @vcResult = REPLACE(@vcResult, ‘´’, NCHAR(0x00b4))
    SELECT @vcResult = REPLACE(@vcResult, ‘µ’, NCHAR(0x00b5))
    SELECT @vcResult = REPLACE(@vcResult, ‘¶’, NCHAR(0x00b6))
    SELECT @vcResult = REPLACE(@vcResult, ‘·’, NCHAR(0x00b7))
    SELECT @vcResult = REPLACE(@vcResult, ‘¸’, NCHAR(0x00b8))
    SELECT @vcResult = REPLACE(@vcResult, ‘&sup1;’, NCHAR(0x00b9))
    SELECT @vcResult = REPLACE(@vcResult, ‘º’, NCHAR(0x00ba))
    SELECT @vcResult = REPLACE(@vcResult, ‘»’, NCHAR(0x00bb))
    SELECT @vcResult = REPLACE(@vcResult, ‘&frac14;’, NCHAR(0x00bc))
    SELECT @vcResult = REPLACE(@vcResult, ‘&frac12;’, NCHAR(0x00bd))
    SELECT @vcResult = REPLACE(@vcResult, ‘&frac34;’, NCHAR(0x00be))
    SELECT @vcResult = REPLACE(@vcResult, ‘¿’, NCHAR(0x00bf))
    SELECT @vcResult = REPLACE(@vcResult, ‘À’, NCHAR(0x00c0))
    SELECT @vcResult = REPLACE(@vcResult, ‘Á’, NCHAR(0x00e1))
    SELECT @vcResult = REPLACE(@vcResult, ‘Â’, NCHAR(0x00c2))
    SELECT @vcResult = REPLACE(@vcResult, ‘Ã’, NCHAR(0x00c3))
    SELECT @vcResult = REPLACE(@vcResult, ‘Ä’, NCHAR(0x00c4))
    SELECT @vcResult = REPLACE(@vcResult, ‘Å’, NCHAR(0x00c5))
    SELECT @vcResult = REPLACE(@vcResult, ‘Æ’, NCHAR(0x00c6))
    SELECT @vcResult = REPLACE(@vcResult, ‘Ç’, NCHAR(0x00c7))
    SELECT @vcResult = REPLACE(@vcResult, ‘È’, NCHAR(0x00c8))
    SELECT @vcResult = REPLACE(@vcResult, ‘É’, NCHAR(0x00e9))
    SELECT @vcResult = REPLACE(@vcResult, ‘Ê’, NCHAR(0x00ca))
    SELECT @vcResult = REPLACE(@vcResult, ‘Ë’, NCHAR(0x00cb))
    SELECT @vcResult = REPLACE(@vcResult, ‘Ì’, NCHAR(0x00cc))
    SELECT @vcResult = REPLACE(@vcResult, ‘Í’, NCHAR(0x00ed))
    SELECT @vcResult = REPLACE(@vcResult, ‘Î’, NCHAR(0x00ce ))
    SELECT @vcResult = REPLACE(@vcResult, ‘Ï’, NCHAR(0x00cf))
    SELECT @vcResult = REPLACE(@vcResult, ‘Ð’, NCHAR(0x00d0))
    SELECT @vcResult = REPLACE(@vcResult, ‘Ñ’, NCHAR(0x00f1))
    SELECT @vcResult = REPLACE(@vcResult, ‘Ò’, NCHAR(0x00d2))
    SELECT @vcResult = REPLACE(@vcResult, ‘Ó’, NCHAR(0x00f3))
    SELECT @vcResult = REPLACE(@vcResult, ‘Ô’, NCHAR(0x00d4))
    SELECT @vcResult = REPLACE(@vcResult, ‘Õ’, NCHAR(0x00d5))
    SELECT @vcResult = REPLACE(@vcResult, ‘Ö’, NCHAR(0x00d6))
    SELECT @vcResult = REPLACE(@vcResult, ‘×’, NCHAR(0x00d7))
    SELECT @vcResult = REPLACE(@vcResult, ‘Ø’, NCHAR(0x00d8))
    SELECT @vcResult = REPLACE(@vcResult, ‘Ù’, NCHAR(0x00d9))
    SELECT @vcResult = REPLACE(@vcResult, ‘Ú’, NCHAR(0x00fa))
    SELECT @vcResult = REPLACE(@vcResult, ‘Û’, NCHAR(0x00db))
    SELECT @vcResult = REPLACE(@vcResult, ‘Ü’, NCHAR(0x00dc))
    SELECT @vcResult = REPLACE(@vcResult, ‘Ý’, NCHAR(0x00dd))
    SELECT @vcResult = REPLACE(@vcResult, ‘Þ’, NCHAR(0x00de))
    SELECT @vcResult = REPLACE(@vcResult, ‘ß’, NCHAR(0x00df))
    SELECT @vcResult = REPLACE(@vcResult, ‘à’, NCHAR(0x00e0))
    SELECT @vcResult = REPLACE(@vcResult, ‘á’, NCHAR(0x00e1))
    SELECT @vcResult = REPLACE(@vcResult, ‘â’, NCHAR(0x00e2))
    SELECT @vcResult = REPLACE(@vcResult, ‘ã’, NCHAR(0x00e3))
    SELECT @vcResult = REPLACE(@vcResult, ‘ä’, NCHAR(0x00e4))
    SELECT @vcResult = REPLACE(@vcResult, ‘å’, NCHAR(0x00e5))
    SELECT @vcResult = REPLACE(@vcResult, ‘æ’, NCHAR(0x00e6))
    SELECT @vcResult = REPLACE(@vcResult, ‘ç’, NCHAR(0x00e7))
    SELECT @vcResult = REPLACE(@vcResult, ‘è’, NCHAR(0x00e8))
    SELECT @vcResult = REPLACE(@vcResult, ‘é’, NCHAR(0x00e9))
    SELECT @vcResult = REPLACE(@vcResult, ‘ê’, NCHAR(0x00ea))
    SELECT @vcResult = REPLACE(@vcResult, ‘ë’, NCHAR(0x00eb))
    SELECT @vcResult = REPLACE(@vcResult, ‘ì’, NCHAR(0x00ec))
    SELECT @vcResult = REPLACE(@vcResult, ‘í’, NCHAR(0x00ed))
    SELECT @vcResult = REPLACE(@vcResult, ‘î’, NCHAR(0x00ee))
    SELECT @vcResult = REPLACE(@vcResult, ‘ï’, NCHAR(0x00ef))
    SELECT @vcResult = REPLACE(@vcResult, ‘ð’, NCHAR(0x00f0))
    SELECT @vcResult = REPLACE(@vcResult, ‘ñ’, NCHAR(0x00f1))
    SELECT @vcResult = REPLACE(@vcResult, ‘ò’, NCHAR(0x00f2))
    SELECT @vcResult = REPLACE(@vcResult, ‘ó’, NCHAR(0x00f3))
    SELECT @vcResult = REPLACE(@vcResult, ‘ô’, NCHAR(0x00f4))
    SELECT @vcResult = REPLACE(@vcResult, ‘õ’, NCHAR(0x00f5))
    SELECT @vcResult = REPLACE(@vcResult, ‘ö’, NCHAR(0x00f6))
    SELECT @vcResult = REPLACE(@vcResult, ‘÷’, NCHAR(0x00f7))
    SELECT @vcResult = REPLACE(@vcResult, ‘ø’, NCHAR(0x00f8))
    SELECT @vcResult = REPLACE(@vcResult, ‘ù’, NCHAR(0x00f9))
    SELECT @vcResult = REPLACE(@vcResult, ‘ú’, NCHAR(0x00fa))
    SELECT @vcResult = REPLACE(@vcResult, ‘û’, NCHAR(0x00fb))
    SELECT @vcResult = REPLACE(@vcResult, ‘ü’, NCHAR(0x00fc))
    SELECT @vcResult = REPLACE(@vcResult, ‘ý’, NCHAR(0x00fd))
    SELECT @vcResult = REPLACE(@vcResult, ‘þ’, NCHAR(0x00fe))
    SELECT @vcResult = REPLACE(@vcResult, ‘ÿ’, NCHAR(0x00ff))
    SELECT @vcResult = REPLACE(@vcResult, ‘Œ’, NCHAR(0x0152))
    SELECT @vcResult = REPLACE(@vcResult, ‘œ’, NCHAR(0x0153))
    SELECT @vcResult = REPLACE(@vcResult, ‘Š’, NCHAR(0x0160))
    SELECT @vcResult = REPLACE(@vcResult, ‘š’, NCHAR(0x0161))
    SELECT @vcResult = REPLACE(@vcResult, ‘Ÿ’, NCHAR(0x0178))
    SELECT @vcResult = REPLACE(@vcResult, ‘ƒ’, NCHAR(0x0192))
    SELECT @vcResult = REPLACE(@vcResult, ‘ˆ’, NCHAR(0x02c6))
    SELECT @vcResult = REPLACE(@vcResult, ‘˜’, NCHAR(0x02dc))
    SELECT @vcResult = REPLACE(@vcResult, ‘Α’, NCHAR(0x0391))
    SELECT @vcResult = REPLACE(@vcResult, ‘Β’, NCHAR(0x0392))
    SELECT @vcResult = REPLACE(@vcResult, ‘Γ’, NCHAR(0x0393))
    SELECT @vcResult = REPLACE(@vcResult, ‘Δ’, NCHAR(0x0394))
    SELECT @vcResult = REPLACE(@vcResult, ‘Ε’, NCHAR(0x0395))
    SELECT @vcResult = REPLACE(@vcResult, ‘Ζ’, NCHAR(0x0396))
    SELECT @vcResult = REPLACE(@vcResult, ‘Η’, NCHAR(0x0397))
    SELECT @vcResult = REPLACE(@vcResult, ‘Θ’, NCHAR(0x0398))
    SELECT @vcResult = REPLACE(@vcResult, ‘Ι’, NCHAR(0x0399))
    SELECT @vcResult = REPLACE(@vcResult, ‘Κ’, NCHAR(0x039a))
    SELECT @vcResult = REPLACE(@vcResult, ‘Λ’, NCHAR(0x039b))
    SELECT @vcResult = REPLACE(@vcResult, ‘Μ’, NCHAR(0x039c))
    SELECT @vcResult = REPLACE(@vcResult, ‘Ν’, NCHAR(0x039d))
    SELECT @vcResult = REPLACE(@vcResult, ‘Ξ’, NCHAR(0x039e))
    SELECT @vcResult = REPLACE(@vcResult, ‘Ο’, NCHAR(0x039f))
    SELECT @vcResult = REPLACE(@vcResult, ‘Π’, NCHAR(0x03a0))
    SELECT @vcResult = REPLACE(@vcResult, ‘& Rho ;’, NCHAR(0x03a1))
    SELECT @vcResult = REPLACE(@vcResult, ‘Σ’, NCHAR(0x03a3))
    SELECT @vcResult = REPLACE(@vcResult, ‘Τ’, NCHAR(0x03a4))
    SELECT @vcResult = REPLACE(@vcResult, ‘Υ’, NCHAR(0x03a5))
    SELECT @vcResult = REPLACE(@vcResult, ‘Φ’, NCHAR(0x03a6))
    SELECT @vcResult = REPLACE(@vcResult, ‘Χ’, NCHAR(0x03a7))
    SELECT @vcResult = REPLACE(@vcResult, ‘Ψ’, NCHAR(0x03a8))
    SELECT @vcResult = REPLACE(@vcResult, ‘Ω’, NCHAR(0x03a9))
    SELECT @vcResult = REPLACE(@vcResult, ‘α’, NCHAR(0x03b1))
    SELECT @vcResult = REPLACE(@vcResult, ‘β’, NCHAR(0x03b2))
    SELECT @vcResult = REPLACE(@vcResult, ‘γ’, NCHAR(0x03b3))
    SELECT @vcResult = REPLACE(@vcResult, ‘δ’, NCHAR(0x03b4))
    SELECT @vcResult = REPLACE(@vcResult, ‘ε’, NCHAR(0x03b5))
    SELECT @vcResult = REPLACE(@vcResult, ‘ζ’, NCHAR(0x03b6))
    SELECT @vcResult = REPLACE(@vcResult, ‘η’, NCHAR(0x03b7))
    SELECT @vcResult = REPLACE(@vcResult, ‘θ’, NCHAR(0x03b8))
    SELECT @vcResult = REPLACE(@vcResult, ‘ι’, NCHAR(0x03b9))
    SELECT @vcResult = REPLACE(@vcResult, ‘κ’, NCHAR(0x03ba))
    SELECT @vcResult = REPLACE(@vcResult, ‘λ’, NCHAR(0x03bb))
    SELECT @vcResult = REPLACE(@vcResult, ‘μ’, NCHAR(0x03bc))
    SELECT @vcResult = REPLACE(@vcResult, ‘ν’, NCHAR(0x03bd))
    SELECT @vcResult = REPLACE(@vcResult, ‘ξ’, NCHAR(0x03be))
    SELECT @vcResult = REPLACE(@vcResult, ‘ο’, NCHAR(0x03bf))
    SELECT @vcResult = REPLACE(@vcResult, ‘π’, NCHAR(0x03c0))
    SELECT @vcResult = REPLACE(@vcResult, ‘ρ’, NCHAR(0x03c1))
    SELECT @vcResult = REPLACE(@vcResult, ‘ς’, NCHAR(0x03c2))
    SELECT @vcResult = REPLACE(@vcResult, ‘σ’, NCHAR(0x03c3))
    SELECT @vcResult = REPLACE(@vcResult, ‘τ’, NCHAR(0x03c4))
    SELECT @vcResult = REPLACE(@vcResult, ‘υ’, NCHAR(0x03c5))
    SELECT @vcResult = REPLACE(@vcResult, ‘φ’, NCHAR(0x03c6))
    SELECT @vcResult = REPLACE(@vcResult, ‘χ’, NCHAR(0x03c7))
    SELECT @vcResult = REPLACE(@vcResult, ‘ψ’, NCHAR(0x03c8))
    SELECT @vcResult = REPLACE(@vcResult, ‘ω’, NCHAR(0x03c9))
    SELECT @vcResult = REPLACE(@vcResult, ‘ϑ’, NCHAR(0x03d1))
    SELECT @vcResult = REPLACE(@vcResult, ‘ϒ’, NCHAR(0x03d2))
    SELECT @vcResult = REPLACE(@vcResult, ‘ϖ’, NCHAR(0x03d6))
    SELECT @vcResult = REPLACE(@vcResult, ‘ ’, NCHAR(0x2002))
    SELECT @vcResult = REPLACE(@vcResult, ‘ ’, NCHAR(0x2003))
    SELECT @vcResult = REPLACE(@vcResult, ‘ ’, NCHAR(0x2009))
    SELECT @vcResult = REPLACE(@vcResult, ‘‌’, NCHAR(0x200c))
    SELECT @vcResult = REPLACE(@vcResult, ‘‍’, NCHAR(0x200d))
    SELECT @vcResult = REPLACE(@vcResult, ‘‎’, NCHAR(0x200e))
    SELECT @vcResult = REPLACE(@vcResult, ‘‏’, NCHAR(0x200f))
    SELECT @vcResult = REPLACE(@vcResult, ‘–’, NCHAR(0x2013))
    SELECT @vcResult = REPLACE(@vcResult, ‘—’, NCHAR(0x2014))
    SELECT @vcResult = REPLACE(@vcResult, ‘‘’, NCHAR(0x2018))
    SELECT @vcResult = REPLACE(@vcResult, ‘’’, NCHAR(0x2019))
    SELECT @vcResult = REPLACE(@vcResult, ‘‚’, NCHAR(0x201a))
    SELECT @vcResult = REPLACE(@vcResult, ‘“’, NCHAR(0x201c))
    SELECT @vcResult = REPLACE(@vcResult, ‘”’, NCHAR(0x201d))
    SELECT @vcResult = REPLACE(@vcResult, ‘„’, NCHAR(0x201e))
    SELECT @vcResult = REPLACE(@vcResult, ‘†’, NCHAR(0x2020))
    SELECT @vcResult = REPLACE(@vcResult, ‘‡’, NCHAR(0x2021))
    SELECT @vcResult = REPLACE(@vcResult, ‘•’, NCHAR(0x2022))
    SELECT @vcResult = REPLACE(@vcResult, ‘…’, NCHAR(0x2026))
    SELECT @vcResult = REPLACE(@vcResult, ‘‰’, NCHAR(0x2030))
    SELECT @vcResult = REPLACE(@vcResult, ‘′’, NCHAR(0x2032))
    SELECT @vcResult = REPLACE(@vcResult, ‘″’, NCHAR(0x2033))
    SELECT @vcResult = REPLACE(@vcResult, ‘‹’, NCHAR(0x2039))
    SELECT @vcResult = REPLACE(@vcResult, ‘›’, NCHAR(0x203a))
    SELECT @vcResult = REPLACE(@vcResult, ‘‾’, NCHAR(0x203e))
    SELECT @vcResult = REPLACE(@vcResult, ‘⁄’, NCHAR(0x2044))
    SELECT @vcResult = REPLACE(@vcResult, ‘€’, NCHAR(0x20ac))
    SELECT @vcResult = REPLACE(@vcResult, ‘ℑ’, NCHAR(0x2111))
    SELECT @vcResult = REPLACE(@vcResult, ‘℘’, NCHAR(0x2118))
    SELECT @vcResult = REPLACE(@vcResult, ‘ℜ’, NCHAR(0x211c))
    SELECT @vcResult = REPLACE(@vcResult, ‘™’, NCHAR(0x2122))
    SELECT @vcResult = REPLACE(@vcResult, ‘ℵ’, NCHAR(0x2135))
    SELECT @vcResult = REPLACE(@vcResult, ‘←’, NCHAR(0x2190))
    SELECT @vcResult = REPLACE(@vcResult, ‘↑’, NCHAR(0x2191))
    SELECT @vcResult = REPLACE(@vcResult, ‘→’, NCHAR(0x2192))
    SELECT @vcResult = REPLACE(@vcResult, ‘↓’, NCHAR(0x2193))
    SELECT @vcResult = REPLACE(@vcResult, ‘↔’, NCHAR(0x2194))
    SELECT @vcResult = REPLACE(@vcResult, ‘↵’, NCHAR(0x21b5))
    SELECT @vcResult = REPLACE(@vcResult, ‘⇐’, NCHAR(0x21d0))
    SELECT @vcResult = REPLACE(@vcResult, ‘⇑’, NCHAR(0x21d1))
    SELECT @vcResult = REPLACE(@vcResult, ‘⇒’, NCHAR(0x21d2))
    SELECT @vcResult = REPLACE(@vcResult, ‘⇓’, NCHAR(0x21d3))
    SELECT @vcResult = REPLACE(@vcResult, ‘⇔’, NCHAR(0x21d4))
    SELECT @vcResult = REPLACE(@vcResult, ‘∀’, NCHAR(0x2200))
    SELECT @vcResult = REPLACE(@vcResult, ‘∂’, NCHAR(0x2202))
    SELECT @vcResult = REPLACE(@vcResult, ‘∃’, NCHAR(0x2203))
    SELECT @vcResult = REPLACE(@vcResult, ‘∅’, NCHAR(0x2205))
    SELECT @vcResult = REPLACE(@vcResult, ‘∇’, NCHAR(0x2207))
    SELECT @vcResult = REPLACE(@vcResult, ‘∈’, NCHAR(0x2208))
    SELECT @vcResult = REPLACE(@vcResult, ‘∉’, NCHAR(0x2209))
    SELECT @vcResult = REPLACE(@vcResult, ‘∋’, NCHAR(0x220b))
    SELECT @vcResult = REPLACE(@vcResult, ‘∏’, NCHAR(0x220f))
    SELECT @vcResult = REPLACE(@vcResult, ‘∑’, NCHAR(0x2211))
    SELECT @vcResult = REPLACE(@vcResult, ‘−’, NCHAR(0x2212))
    SELECT @vcResult = REPLACE(@vcResult, ‘∗’, NCHAR(0x2217))
    SELECT @vcResult = REPLACE(@vcResult, ‘√’, NCHAR(0x221a))
    SELECT @vcResult = REPLACE(@vcResult, ‘∝’, NCHAR(0x221d))
    SELECT @vcResult = REPLACE(@vcResult, ‘∞’, NCHAR(0x221e))
    SELECT @vcResult = REPLACE(@vcResult, ‘∠’, NCHAR(0x2220))
    SELECT @vcResult = REPLACE(@vcResult, ‘∧’, NCHAR(0x2227))
    SELECT @vcResult = REPLACE(@vcResult, ‘∨’, NCHAR(0x2228))
    SELECT @vcResult = REPLACE(@vcResult, ‘∩’, NCHAR(0x2229))
    SELECT @vcResult = REPLACE(@vcResult, ‘∪’, NCHAR(0x222a))
    SELECT @vcResult = REPLACE(@vcResult, ‘∫’, NCHAR(0x222b))
    SELECT @vcResult = REPLACE(@vcResult, ‘&there4;’, NCHAR(0x2234))
    SELECT @vcResult = REPLACE(@vcResult, ‘∼’, NCHAR(0x223c))
    SELECT @vcResult = REPLACE(@vcResult, ‘≅’, NCHAR(0x2245))
    SELECT @vcResult = REPLACE(@vcResult, ‘≈’, NCHAR(0x2248))
    SELECT @vcResult = REPLACE(@vcResult, ‘≠’, NCHAR(0x2260))
    SELECT @vcResult = REPLACE(@vcResult, ‘≡’, NCHAR(0x2261))
    SELECT @vcResult = REPLACE(@vcResult, ‘≤’, NCHAR(0x2264))
    SELECT @vcResult = REPLACE(@vcResult, ‘≥’, NCHAR(0x2265))
    SELECT @vcResult = REPLACE(@vcResult, ‘⊂’, NCHAR(0x2282))
    SELECT @vcResult = REPLACE(@vcResult, ‘⊃’, NCHAR(0x2283))
    SELECT @vcResult = REPLACE(@vcResult, ‘⊄’, NCHAR(0x2284))
    SELECT @vcResult = REPLACE(@vcResult, ‘⊆’, NCHAR(0x2286))
    SELECT @vcResult = REPLACE(@vcResult, ‘⊇’, NCHAR(0x2287))
    SELECT @vcResult = REPLACE(@vcResult, ‘⊕’, NCHAR(0x2295))
    SELECT @vcResult = REPLACE(@vcResult, ‘⊗’, NCHAR(0x2297))
    SELECT @vcResult = REPLACE(@vcResult, ‘⊥’, NCHAR(0x22a5))
    SELECT @vcResult = REPLACE(@vcResult, ‘⋅’, NCHAR(0x22c5))
    SELECT @vcResult = REPLACE(@vcResult, ‘⌈’, NCHAR(0x2308))
    SELECT @vcResult = REPLACE(@vcResult, ‘⌉’, NCHAR(0x2309))
    SELECT @vcResult = REPLACE(@vcResult, ‘⌊’, NCHAR(0x230a))
    SELECT @vcResult = REPLACE(@vcResult, ‘⌋’, NCHAR(0x230b))
    SELECT @vcResult = REPLACE(@vcResult, ‘⟨’, NCHAR(0x2329))
    SELECT @vcResult = REPLACE(@vcResult, ‘⟩’, NCHAR(0x232a))
    SELECT @vcResult = REPLACE(@vcResult, ‘◊’, NCHAR(0x25ca))
    SELECT @vcResult = REPLACE(@vcResult, ‘♠’, NCHAR(0x2660))
    SELECT @vcResult = REPLACE(@vcResult, ‘♣’, NCHAR(0x2663))
    SELECT @vcResult = REPLACE(@vcResult, ‘♥’, NCHAR(0x2665))
    SELECT @vcResult = REPLACE(@vcResult, ‘♦’, NCHAR(0x2666))

    SELECT @vcResult = REPLACE(@vcResult, ”, @vcCrLf)

    RETURN @vcResult
    END

    Reply
    • this is not working properly. Can you help? I would like to get this to work for me…seems there are missing chars where you use CHARINDEX (see code under /*tag general */ )…see code above…

      Reply
    • These lines of code above seem to be incomplete. Can someone help me fix this?

      SET @Start = CHARINDEX(”,@HTMLText,CHARINDEX(‘ 0
      AND @End > 0
      AND @Length > 0
      BEGIN
      SET @HTMLText = STUFF(@HTMLText,@Start,@Length,”)
      SET @Start = CHARINDEX(”,@HTMLText,CHARINDEX(‘ 0)

      Reply
  • How can I alter this proc to give it a table parameter and table output?

    Reply
  • Awesome, works well thanx mate

    Reply
  • Christopher Wolff
    January 20, 2012 10:03 pm

    Sorry so late to the party but I just wanted to add one part that I put in to make it perfect.

    RETURN LTRIM(RTRIM(REPLACE(@HTMLText,’ ‘,’ ‘)))

    Thanks for your help on this and all other articles I have used

    Reply
  • Though this article has been written in year 2007, I found it really useful, in 2012. I love the web for that reason. I only didn’t understand why should we replace single quotes with 2 single quotes? What the reason is?

    Reply
    • Just out of curiousity….do you have the same as I do in the code you are using as I am ? I copied and pasted from abofe and I have this code below and seems incomplete/cutoff and It is not working for m.

      SET @Start = CHARINDEX(”,@HTMLText,CHARINDEX(‘ 0
      AND @End > 0
      AND @Length > 0
      BEGIN
      SET @HTMLText = STUFF(@HTMLText,@Start,@Length,”)
      SET @Start = CHARINDEX(”,@HTMLText,CHARINDEX(‘ 0)

      Reply
  • Does anyone know why these lines above are incomplete. Doesn’t work for me….

    These lines of code above seem to be incomplete. Can someone help me fix this?

    SET @Start = CHARINDEX(”,@HTMLText,CHARINDEX(‘ 0
    AND @End > 0
    AND @Length > 0
    BEGIN
    SET @HTMLText = STUFF(@HTMLText,@Start,@Length,”)
    SET @Start = CHARINDEX(”,@HTMLText,CHARINDEX(‘ 0)

    Reply
  • chandan varma
    May 25, 2012 2:42 pm

    how we can the write the prime number program in t-sql using sql server 2005,there are some conditions:-
    1:-we don’t use table,
    2:-use if-else and while loop

    Reply
  • hi pinal ,
    i want to open html file in sql
    so please give me solution

    Reply
  • Hi Pinal Sir,
    The function is very useful
    But I am trying to get string in bold if Tag is specified using TSQL commands.
    Same for other tags also.
    Plz give me soln !!!!!!!!!!!!

    Reply
  • GRACIAS,,,ESTA SENCILLO LA FUNCION.

    Reply
  • कौशल उप्रेती
    November 1, 2012 12:16 pm

    case:if the HTML tag is like HTML BODY , is this method useful for this case . if not please suggests.

    Reply
  • james mcfarland
    January 29, 2014 2:03 am

    Thanks Pinal, and lazycoders – I used the lazycoders version with a few typo fixes (See this StackOverflow post: – all of that together provided me with a great and easy-to-understand solution. Well done generous ones.

    Reply
  • Hi Pinal,

    I have taken a look at the LazyCoder’s SQL function. Could you please explain why can’t we just replace the BREAK or tags with a REPLACE? Why do we need to use a WHILE loop to achieve this replacement?

    Why this?
    – Replace any tags with a newline
    — Replace any tags with a newline
    SET @Start = CHARINDEX(”, @HTMLText)
    SET @End = @Start + 4
    SET @Length = (@End – @Start) + 1

    WHILE (@Start > 0 AND @End > 0 AND @Length > 0) BEGIN
    SET @HTMLText = STUFF(@HTMLText, @Start, @Length, ‘CHAR(13) + CHAR(10)’)
    SET @Start = CHARINDEX(”, @HTMLText)
    SET @End = @Start + 4
    SET @Length = (@End – @Start) + 1
    END

    Why not just?

    SET @HTMLText = REPLACE(@HTMLText, ”, ‘CHAR(13) + CHAR(10)’)

    Thanks,
    Sathish

    Reply
  • Hi,

    In database i have stored the values like (&gt, ;<, /p, >) this , unable to remove the tags. Any solution is there.

    Reply
  • Pinal, you are the best!!! Every time I have a question, you have the answer. You are becoming very popular at the office! Thank you!

    Reply

Leave a Reply