
Get The Specific Word In Text In Html Page

If I have the following HTML page:

Hello world!

Hello and Hello again this is an example

Solution 1:

This is easy to do with XSLT.

XSLT 1.0 solution:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:param name="pTarget" select="'hello'"/>
 <xsl:param name="pReplacement" select="'welcome'"/>

 <xsl:variable name="vtargetLength" select="string-length($pTarget)"/>

 <!-- Alphabets used by translate() to lower-case text in XSLT 1.0 -->
 <xsl:variable name="vUpper" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
 <xsl:variable name="vLower" select="'abcdefghijklmnopqrstuvwxyz'"/>

 <!-- Identity rule: copy every node and attribute unchanged -->
 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <!-- Replace each space-delimited occurrence of $pTarget in a text node -->
 <xsl:template match="text()" name="replace">
  <xsl:param name="pText" select="."/>

  <xsl:variable name="vLowerText"
       select="translate($pText, $vUpper, $vLower)"/>

  <xsl:choose>
   <!-- No whole-word occurrence left: output the text as is -->
   <xsl:when test=
    "not(contains(concat(' ', $vLowerText, ' '),
                  concat(' ', $pTarget, ' ')
                  )
         )">
    <xsl:value-of select="$pText"/>
   </xsl:when>
   <xsl:otherwise>
    <!-- Number of characters of $pText that precede the match -->
    <xsl:variable name="vOffset" select=
     "string-length(
           substring-before(concat(' ', $vLowerText, ' '),
                            concat(' ', $pTarget, ' ')
                            )
                    )"/>

    <xsl:value-of select="substring($pText, 1, $vOffset)"/>
    <xsl:value-of select="$pReplacement"/>

    <!-- Recurse on the text that follows the replaced word -->
    <xsl:call-template name="replace">
     <xsl:with-param name="pText" select=
      "substring($pText, $vOffset + $vtargetLength + 1)"/>
    </xsl:call-template>
   </xsl:otherwise>
  </xsl:choose>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the provided XML document:

<div><p>
  Hello world!
 </p><p><a href="example.com"> Hello and Hello again this is an example</a></p></div>

the wanted, correct result is produced:

<div><p>
  welcome world!
 </p><p><a href="example.com"> welcome and welcome again this is an example</a></p></div>

My assumption is that the matching and replacement are case-insensitive (i.e. "hello" and "heLlo" should both be replaced with "welcome"). If a case-sensitive match is required, the transformation can be considerably simplified.
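As a rough illustration of the case-sensitive case, here is a minimal sketch of an XSLT 1.0 variant. It is not part of the original answer. It assumes the target is supplied with the exact capitalization to match (the default 'Hello' here is my assumption), and the variable names vPaddedTarget and vPaddedText are introduced only for this example. The translate() call and the two alphabet variables are no longer needed, because the text is compared as is.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <!-- Assumption: the target is given in the exact case to be matched -->
 <xsl:param name="pTarget" select="'Hello'"/>
 <xsl:param name="pReplacement" select="'welcome'"/>

 <!-- Hypothetical helper: the target padded with spaces, so that only
      whole, space-delimited words match -->
 <xsl:variable name="vPaddedTarget" select="concat(' ', $pTarget, ' ')"/>

 <!-- Identity rule: copy every node and attribute unchanged -->
 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="text()" name="replace">
  <xsl:param name="pText" select="."/>

  <!-- Pad the text too, so a word at either end can still match -->
  <xsl:variable name="vPaddedText" select="concat(' ', $pText, ' ')"/>

  <xsl:choose>
   <xsl:when test="not(contains($vPaddedText, $vPaddedTarget))">
    <xsl:value-of select="$pText"/>
   </xsl:when>
   <xsl:otherwise>
    <!-- Number of characters of $pText that precede the match -->
    <xsl:variable name="vOffset" select=
     "string-length(substring-before($vPaddedText, $vPaddedTarget))"/>

    <xsl:value-of select="substring($pText, 1, $vOffset)"/>
    <xsl:value-of select="$pReplacement"/>

    <!-- Recurse on the text that follows the replaced word -->
    <xsl:call-template name="replace">
     <xsl:with-param name="pText" select=
      "substring($pText, $vOffset + string-length($pTarget) + 1)"/>
    </xsl:call-template>
   </xsl:otherwise>
  </xsl:choose>
 </xsl:template>
</xsl:stylesheet>
```

Like the case-insensitive version, this sketch only recognizes occurrences delimited by spaces on both sides (or by the start or end of the text node), so a word directly followed by punctuation, such as "Hello!", is left unchanged.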

XSLT 2.0 Solution:

<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:param name="pTarget" select="'hello'"/>
 <xsl:param name="pReplacement" select="'welcome'"/>

 <!-- Identity rule: copy every node and attribute unchanged -->
 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <!-- Only text nodes that contain the target (in any case) are rewritten -->
 <xsl:template match="text()[matches(., $pTarget, 'i')]">
  <!-- Pad with spaces so that only space-delimited words are replaced -->
  <xsl:variable name="vEnlargedRep" select=
   "replace(concat(' ', ., ' '),
            concat(' ', $pTarget, ' '),
            concat(' ', $pReplacement, ' '),
            'i')"/>

  <xsl:variable name="vLen" select="string-length($vEnlargedRep)"/>

  <!-- Strip the two padding spaces again -->
  <xsl:sequence select="substring($vEnlargedRep, 2, $vLen - 2)"/>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the provided XML document (shown above), again the wanted, correct result is produced:

<div><p>
  welcome world!
 </p><p><a href="example.com"> welcome and welcome again this is an example</a></p></div>

Explanation: This uses the standard XPath 2.0 functions matches() and replace(), passing the flag "i" for case-insensitive operation. The flags argument is the third argument of matches() and the fourth argument of replace().
