Strip out HTML Tags to Display Plain Text in XSLT
I had a requirement on my ongoing project: I needed to display the blog RSS feed on the home page. I used BlogEngine.NET
with some visual modifications.
On the home page I need to display the latest 4 blog entries, which I can easily get through the RSS link from BlogEngine.NET.
The issue comes when I need to render the contents of those latest 4 blog entries. They are entered using the FCKEditor tool, so I was getting text with HTML tags.
Because BlogEngine.NET uses FCKEditor for posting a new blog entry, useless <p>, <div>, <span>, <font> and other tags, along with fonts in different sizes and colors, come with the blog post. This happens especially when people copy and paste content directly from another web source.
I need to strip out these HTML tags and display plain text for a consistent look on my home page, so that these additional <p>, <div>, <span>, <font> and other tags do not blot it. To display the RSS entries, which simply means creating an RSS feed reader (I think I will blog on that also), I used an XSLT file to format the RSS content.
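As a rough illustration of the feed-reader idea, the sketch below lists the first four entries of a feed. It assumes a standard RSS 2.0 layout (rss/channel/item with title and link children) and a feed that lists the newest entries first; the surrounding <ul> markup is only an example, not the markup used on my home page.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="yes"/>

  <!-- Assumption: the input document is an RSS 2.0 feed -->
  <xsl:template match="/">
    <ul>
      <!-- Keep only the first four <item> elements of the channel -->
      <xsl:for-each select="rss/channel/item[position() &lt;= 4]">
        <li>
          <!-- {link} is an attribute value template: it inserts the item's link text -->
          <a href="{link}"><xsl:value-of select="title"/></a>
        </li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>
```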
I found the XSLT function below (a named template, in XSLT terms), which removes the HTML tags from my Description field from the DB (which is filled from FCKEditor). Its recursive branch is:
<xsl:when test="contains($html, '&lt;')">
<xsl:value-of select="substring-before($html, '&lt;')"/>
<!-- Recurse through HTML -->
<xsl:call-template name="removeHtmlTags">
<xsl:with-param name="html" select="substring-after($html, '&gt;')"/>
</xsl:call-template>
</xsl:when>
The function is named removeHtmlTags and accepts one parameter named html, which is my Description field containing the HTML tags.
The logic is simple. It is a recursive function that looks for '&lt;' (that is, '<'), the start of any HTML tag, and outputs the text before it using substring-before($html, '&lt;'). It then calls itself with the rest of the string after '&gt;' (that is, '>'), the end of the tag. When no '<' is left, the recursion stops and the remaining text is output as it is.
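Put together, the whole template might look like the sketch below. The xsl:param declaration and the xsl:otherwise branch are reconstructed from the description above, and the root template with its post/DescriptionField input shape is only an assumption added so the stylesheet can be run on its own.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumed input: <post><DescriptionField>escaped HTML</DescriptionField></post> -->
  <xsl:template match="/">
    <xsl:call-template name="removeHtmlTags">
      <xsl:with-param name="html" select="post/DescriptionField"/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="removeHtmlTags">
    <xsl:param name="html"/>
    <xsl:choose>
      <xsl:when test="contains($html, '&lt;')">
        <!-- Emit the text that precedes the next tag -->
        <xsl:value-of select="substring-before($html, '&lt;')"/>
        <!-- Recurse on whatever follows the next '>' -->
        <xsl:call-template name="removeHtmlTags">
          <xsl:with-param name="html" select="substring-after($html, '&gt;')"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <!-- No tag left: emit the remaining text unchanged -->
        <xsl:value-of select="$html"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

For a field whose string value is <p>Hello <b>world</b></p>, the calls emit "", "Hello ", "world" and "" in turn, giving Hello world. One caveat: substring-after($html, '&gt;') looks for the first '>' anywhere in the string, so if a bare '>' can appear in the text before the first tag, substring-after(substring-after($html, '&lt;'), '&gt;') is a safer argument.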
This is how this function will be called:
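<xsl:variable name="pureText">
<xsl:call-template name="removeHtmlTags"><xsl:with-param name="html" select="DescriptionField" /></xsl:call-template>
</xsl:variable>
<div height='40' class='blog_text'>
<xsl:value-of disable-output-escaping="yes" select="substring($pureText, 0, 175)"/>
</div>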
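A variable named pureText is declared. The removeHtmlTags function strips out the HTML tags and returns the plain text into this pureText variable.
I am passing DescriptionField, which is my DB field with the HTML tags.
Finally, I display at most 175 characters of the plain text inside a DIV using substring($pureText, 0, 175). Note that XPath string positions start at 1, so this call actually returns the first 174 characters; substring($pureText, 1, 175) returns exactly the first 175.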
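The truncation step could also be wrapped in its own template, as in the sketch below. The template name truncateText, its text and limit parameters, the whitespace clean-up with normalize-space() and the trailing '...' are all my own illustrative additions, not part of the function I found. The root template simply feeds it the whole text of whatever input document is supplied.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Demo entry point: truncate the text content of the input document -->
  <xsl:template match="/">
    <xsl:call-template name="truncateText">
      <xsl:with-param name="text" select="string(.)"/>
      <xsl:with-param name="limit" select="175"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Hypothetical helper: cut text to at most $limit characters -->
  <xsl:template name="truncateText">
    <xsl:param name="text"/>
    <xsl:param name="limit" select="175"/>
    <!-- Collapse whitespace runs, such as those left where tags were removed -->
    <xsl:variable name="clean" select="normalize-space($text)"/>
    <xsl:choose>
      <xsl:when test="string-length($clean) &gt; $limit">
        <!-- Positions are 1-based, so start at 1 to keep exactly $limit characters -->
        <xsl:value-of select="concat(substring($clean, 1, $limit), '...')"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$clean"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```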