Get The Specific Word In Text In Html Page
If I have the following HTML page
Hello world!
Hello and Hello again this is an example
Solution 1:
This is easy to do with XSLT.
XSLT 1.0 solution:
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:param name="pTarget" select="'hello'"/>
 <xsl:param name="pReplacement" select="'welcome'"/>

 <xsl:variable name="vtargetLength" select="string-length($pTarget)"/>
 <xsl:variable name="vUpper" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
 <xsl:variable name="vLower" select="'abcdefghijklmnopqrstuvwxyz'"/>

 <!-- Identity rule: copy every node and attribute unchanged -->
 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <!-- Replace each space-delimited occurrence of $pTarget, ignoring case -->
 <xsl:template match="text()" name="replace">
  <xsl:param name="pText" select="."/>

  <!-- Lower-cased copy of the text, used only for searching -->
  <xsl:variable name="vLowerText"
   select="translate($pText, $vUpper, $vLower)"/>

  <xsl:choose>
   <xsl:when test=
    "not(contains(concat(' ', $vLowerText, ' '),
                  concat(' ', $pTarget, ' ')
                 )
        )">
    <xsl:value-of select="$pText"/>
   </xsl:when>
   <xsl:otherwise>
    <!-- Number of characters of $pText that precede the matched word -->
    <xsl:variable name="vOffset" select=
     "string-length(
        substring-before(concat(' ', $vLowerText, ' '),
                         concat(' ', $pTarget, ' ')
                        )
      )"/>

    <xsl:value-of select="substring($pText, 1, $vOffset)"/>
    <xsl:value-of select="$pReplacement"/>

    <!-- Recurse on the text that follows the matched word -->
    <xsl:call-template name="replace">
     <xsl:with-param name="pText" select=
      "substring($pText, $vOffset + $vtargetLength + 1)"/>
    </xsl:call-template>
   </xsl:otherwise>
  </xsl:choose>
 </xsl:template>
</xsl:stylesheet>
When this transformation is applied to the provided XML document:
<div>
  <p>
    Hello world!
  </p>
  <p><a href="example.com"> Hello and Hello again this is an example</a></p>
</div>
the wanted, correct result is produced:
<div>
  <p>
    welcome world!
  </p>
  <p><a href="example.com"> welcome and welcome again this is an example</a></p>
</div>
My assumption is that the matching and replacement are case-insensitive (i.e. "hello" and "heLlo" should both be replaced with "welcome"). If a case-sensitive match is required, the transformation can be considerably simplified.
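As a rough illustration of that simplification (a sketch, not code from the original answer), the case-sensitive variant can drop the translate() call and the vUpper/vLower variables and search the text directly. The variable names vPadded and vPaddedText are introduced here for illustration only, and pTarget is assumed to be 'Hello' so that it matches the capitalization in the sample document:

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <!-- Assumption: the target is given in the exact case to be matched -->
 <xsl:param name="pTarget" select="'Hello'"/>
 <xsl:param name="pReplacement" select="'welcome'"/>

 <!-- Hypothetical helper: the target word with one space on each side -->
 <xsl:variable name="vPadded" select="concat(' ', $pTarget, ' ')"/>

 <!-- Identity rule: copy every node and attribute unchanged -->
 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="text()" name="replace">
  <xsl:param name="pText" select="."/>
  <!-- Hypothetical helper: the text padded the same way as the target -->
  <xsl:variable name="vPaddedText" select="concat(' ', $pText, ' ')"/>

  <xsl:choose>
   <xsl:when test="not(contains($vPaddedText, $vPadded))">
    <xsl:value-of select="$pText"/>
   </xsl:when>
   <xsl:otherwise>
    <!-- Number of characters of $pText that precede the matched word -->
    <xsl:variable name="vOffset" select=
     "string-length(substring-before($vPaddedText, $vPadded))"/>

    <xsl:value-of select="substring($pText, 1, $vOffset)"/>
    <xsl:value-of select="$pReplacement"/>

    <!-- Recurse on the text that follows the matched word -->
    <xsl:call-template name="replace">
     <xsl:with-param name="pText" select=
      "substring($pText, $vOffset + string-length($pTarget) + 1)"/>
    </xsl:call-template>
   </xsl:otherwise>
  </xsl:choose>
 </xsl:template>
</xsl:stylesheet>
```

Like the case-insensitive version, this sketch only recognizes occurrences that are delimited by a space character (or by the start or end of the text node) on both sides.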
XSLT 2.0 Solution:
<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:param name="pTarget" select="'hello'"/>
 <xsl:param name="pReplacement" select="'welcome'"/>

 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="text()[matches(., $pTarget, 'i')]">
  <xsl:variable name="vEnlargedRep" select=
   "replace(concat(' ', ., ' '),
            concat(' ', $pTarget, ' '),
            concat(' ', $pReplacement, ' '),
            'i')"/>

  <xsl:variable name="vLen" select="string-length($vEnlargedRep)"/>

  <xsl:sequence select=
   "substring($vEnlargedRep, 2, $vLen - 2)"/>
 </xsl:template>
</xsl:stylesheet>
When this transformation is applied to the provided XML document (shown above), the wanted, correct result is again produced:
<div>
  <p>
    welcome world!
  </p>
  <p><a href="example.com"> welcome and welcome again this is an example</a></p>
</div>
Explanation: This solution uses the standard XPath 2.0 functions matches() and replace(), passing the flag "i" (the third argument of matches() and the fourth argument of replace()) to request case-insensitive operation.
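To make the position of the flags argument concrete, here is a small standalone sketch, assuming an XSLT 2.0 processor. The literal input strings are made up for illustration and are not part of the original answer:

```xml
<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

 <xsl:template match="/">
  <!-- matches(input, pattern, flags): evaluates to true here -->
  <xsl:value-of select="matches('Say HeLlo', 'hello', 'i')"/>
  <xsl:text>&#10;</xsl:text>
  <!-- replace(input, pattern, replacement, flags):
       yields ' welcome world ' (with the padding spaces kept) -->
  <xsl:value-of select=
   "replace(' HeLlo world ', ' hello ', ' welcome ', 'i')"/>
 </xsl:template>
</xsl:stylesheet>
```

Both functions treat the second argument as a regular expression, so a target word containing regex metacharacters would need escaping before being passed in as $pTarget.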