Solr / SOLR-1003

XPathEntityProcessor must allow slurping all text from a given XML node and its children

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.4
    • Fix Version/s: 1.4
    • Labels:
      None

      Description

      Take an example:

      <xhtml:p>This text is 
        <xhtml:b>bold</xhtml:b> and this text is 
        <xhtml:u>underlined</xhtml:u>!
      </xhtml:p>
      

      It may be useful to get all the text from all the tags inside <xhtml:p>, ignoring the tag names.

      The configuration of the field may look like:

      <field column="para" xpath="/p" flatten="true"/>
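
To make the intended result concrete, here is a minimal Python sketch of the flattening behaviour described above. It is not the XPathEntityProcessor implementation; it only illustrates "collect all text under a node, ignoring tag names". The namespace URI bound to the xhtml prefix, the helper name flatten_text, and the whitespace normalization are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the xhtml prefix is bound to the standard XHTML namespace.
XML = """<xhtml:p xmlns:xhtml="http://www.w3.org/1999/xhtml">This text is
  <xhtml:b>bold</xhtml:b> and this text is
  <xhtml:u>underlined</xhtml:u>!
</xhtml:p>"""


def flatten_text(node):
    """Concatenate the text of node and all its descendants, ignoring tag names.

    Hypothetical helper: whitespace runs are collapsed to single spaces,
    which is a choice made here and not necessarily what Solr does.
    """
    raw = "".join(node.itertext())  # text and tail of every descendant, in document order
    return " ".join(raw.split())


para = flatten_text(ET.fromstring(XML))
print(para)  # This text is bold and this text is underlined!
```

The value printed for para corresponds to what a field declared with flatten="true" would be expected to hold for the example paragraph.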
      

          Activity

          Fergus McMenemie added a comment -

          What is the difference between the HTMLStripTransformer and what is proposed here? Surely both would return:

          "This text is bold and this text is underlined!"

          Shalin Shekhar Mangar added a comment -

          No, not really. If HTML is embedded inside an XML document, it needs to be encoded properly (replace '<' with &lt; etc.). The example described here does not contain HTML; rather, it contains XML nodes inside the "xhtml:p" node, mixed with text nodes. This is the same example that led to the discovery of SOLR-999.
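
The distinction drawn in this comment can be sketched in Python as follows. This is only an illustration using the standard library parser, not Solr code, and the variable names encoded and nested are made up for the example. Encoded HTML arrives as a single text node that still contains markup characters (the case an HTML stripper is for), while real child elements split the text across several nodes (the case flattening is for).

```python
import xml.etree.ElementTree as ET

# HTML embedded in XML must be escaped, so the parser sees one text node.
encoded = ET.fromstring("<p>This text is &lt;b&gt;bold&lt;/b&gt;</p>")

# Real child elements: the text is split between the parent and the child node.
nested = ET.fromstring("<p>This text is <b>bold</b></p>")

print(len(encoded), repr(encoded.text))  # 0 'This text is <b>bold</b>'
print(len(nested), repr(nested.text))    # 1 'This text is '

# Only the nested form needs its descendants' text gathered.
print("".join(nested.itertext()))        # This text is bold
```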

          Shalin Shekhar Mangar added a comment -

          Committed revision 741268.

          Thanks Noble!

          Grant Ingersoll added a comment -

          Bulk close for Solr 1.4


            People

            • Assignee:
              Shalin Shekhar Mangar
              Reporter:
              Noble Paul
            • Votes:
              0
              Watchers:
              0
