Strip out HTML Tags to Display Plain Text in XSLT

I had a requirement on my ongoing project to display the blog's RSS feed on the home page. I used BlogEngine.NET with some visual modifications.
On the home page I needed to display the latest 4 blog entries, which I could easily get through the RSS link from BlogEngine.NET.

The issue came when I needed to render the contents of those latest 4 blog entries. They are entered using the FCKEditor tool, so I was getting text with HTML tags.
Because BlogEngine.NET uses FCKEditor for posting a new blog entry, useless <p>, <div>, <span> and <font> tags, &nbsp; entities, and fonts of different sizes and colors come along with the blog post, especially when people copy and paste content directly from another web source.

I needed to strip out these HTML tags and display plain text for a consistent look on my home page, so that these additional <p>, <div>, <span> and <font> tags and &nbsp; entities would not blot it. To display the RSS entries, that is, to create a simple RSS feed reader (I think I will blog on that also), I used an XSLT file to format the RSS content.

I found the XSLT function below, which removes the HTML tags from my Description field from the DB (which is filled from FCKEditor):

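  <xsl:template name="removeHtmlTags">
    <xsl:param name="html"/>
    <xsl:choose>
      <xsl:when test="contains($html, '&lt;')">
        <!-- Output the text that precedes the next tag -->
        <xsl:value-of select="substring-before($html, '&lt;')"/>
        <!-- Recurse through HTML -->
        <xsl:call-template name="removeHtmlTags">
          <xsl:with-param name="html" select="substring-after($html, '&gt;')"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <!-- No tags left: output the remaining text as is -->
        <xsl:value-of select="$html"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>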

You can see that the function is named removeHtmlTags and accepts one parameter named html, which is my Description field containing HTML tags.
The logic is simple: it is a recursive function that looks for '&lt;', that is '<', which marks the start of an HTML tag. It outputs the text before this '<' using substring-before($html, '&lt;'), then calls itself with the rest of the string after '&gt;', that is '>', the end of that tag. When no '<' is left, it outputs the remaining string as is.
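
To make the recursion easier to follow, here is a minimal sketch of the same logic in Python. It is only an illustration and not part of the stylesheet: the function name remove_html_tags and the sample strings are my own assumptions, and str.partition stands in for XPath's substring-before() and substring-after().

```python
def remove_html_tags(html: str) -> str:
    """Python mirror of the recursive removeHtmlTags template (illustrative only)."""
    if "<" in html:
        # substring-before($html, '<'): the text that precedes the next tag
        before = html.partition("<")[0]
        # substring-after($html, '>'): everything after the first '>'.
        # Like XPath, this yields an empty string when no '>' is present.
        rest = html.partition(">")[2]
        return before + remove_html_tags(rest)
    # No '<' left: return the remaining text unchanged
    return html


print(remove_html_tags("<p>Hello <b>world</b></p>"))  # Hello world
print(remove_html_tags("<div>A&nbsp;B</div>"))        # A&nbsp;B
```

The second call shows that only tags are removed. Entities such as &nbsp; stay in the text, so they still have to be handled when the result is written out.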

This is how this function will be called:

    <xsl:template name="RssCell">
        <xsl:variable name="pureText">
            <xsl:call-template name="removeHtmlTags">
                <xsl:with-param name="html" select="DescriptionField" />
            </xsl:call-template>
        </xsl:variable>
        <div height='40' class='blog_text'>
            <xsl:value-of disable-output-escaping="yes"  select="substring($pureText, 1, 175)"/>
        </div>
    </xsl:template>

A variable named pureText is declared. The removeHtmlTags template strips out the HTML tags and returns the plain text into this pureText variable.
I am passing DescriptionField, which is my DB field with HTML tags.

Finally, I display at most 175 characters of the plain text inside a DIV using substring($pureText, 1, 175). XPath string positions start at 1, so the start argument is 1 rather than 0; a start of 0 would return only 174 characters.
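
One detail worth checking is how XPath's substring() counts. Positions are 1-based, and the function keeps the characters whose position is at least the (rounded) start and less than start plus length. The sketch below imitates that rule in Python so a start of 0 can be compared with a start of 1. The helper name xpath_substring is hypothetical, and it only covers ordinary finite numbers, not the NaN and infinity cases of the XPath specification.

```python
import math


def xpath_substring(s: str, start: float, length: float) -> str:
    """Approximate XPath 1.0 substring() for finite numeric arguments."""
    first = math.floor(start + 0.5)         # XPath round(): halves round up
    end = first + math.floor(length + 0.5)  # one past the last position kept
    # Positions are 1-based: keep characters with first <= position < end
    return "".join(ch for pos, ch in enumerate(s, start=1) if first <= pos < end)


text = "x" * 200
print(len(xpath_substring(text, 0, 175)))  # 174
print(len(xpath_substring(text, 1, 175)))  # 175
print(xpath_substring("12345", 0, 3))      # 12
```

With a start of 0, position 0 does not exist, so one character of the requested length is lost.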

That's it!


# re: Strip out HTML Tags to Display Plain Text in XSLT

Friday, November 21, 2008 1:45 AM by AndreiR23

Have a look at my own implementation of PHP's strip_tags function here ->

# re: Strip out HTML Tags to Display Plain Text in XSLT

Thursday, April 29, 2010 5:46 AM by elizas

In some cases, when displaying a large number, it is nice to format it into a more readable form.

Like: Reputation Point : 537456

It is more readable if we write it as Reputation Point : 537,456

ASP.NET provides features like String.Format(format, arg0) to format arguments into different forms.

For the above, you can use

Response.Write(String.Format("{0:#,###}", 123456789));

which will print 123,456,789

{0:#,###} is known as the format string, where the braces “{” and “}” are compulsory.

The first part, before ':', is the argument number, and it must be an integer.

The second part, after ':', is the format that you want your argument converted to.

# Removing HTML from XSLT in SharePoint List « samiv2

Tuesday, November 20, 2012 5:27 AM by Removing HTML from XSLT in SharePoint List « samiv2

Pingback from  Removing HTML from XSLT in SharePoint List « samiv2