Hadoop Common / HADOOP-14501

Switch from aalto-xml to woodstox to handle odd XML features

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.9.0, 3.0.0-alpha4
    • Fix Version/s: 2.9.0, 3.0.0-alpha4
    • Component/s: conf
    • Labels: None

    Description

      hgadre tried testing Solr with a Hadoop 3 client and saw several test case failures that look like functionality gaps in the new aalto-xml StAX implementation pulled in by HADOOP-14216:

         [junit4]    > Throwable #1: com.fasterxml.aalto.WFCException: Illegal XML character ('ü' (code 252))
      ....
         [junit4]    > Caused by: com.fasterxml.aalto.WFCException: General entity reference (&bar;) encountered in entity expanding mode: operation not (yet) implemented
      ...
         [junit4]    > Throwable #1: org.apache.solr.common.SolrException: General entity reference (&wacky;) encountered in entity expanding mode: operation not (yet) implemented
      

      These were from the following test case executions:

      NOTE: reproduce with: ant test  -Dtestcase=DocumentAnalysisRequestHandlerTest -Dtests.method=testCharsetOutsideDocument -Dtests.seed=2F739D88D9C723CA -Dtests.slow=true -Dtests.locale=und -Dtests.timezone=Atlantic/Faeroe -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
      NOTE: reproduce with: ant test  -Dtestcase=MBeansHandlerTest -Dtests.method=testXMLDiffWithExternalEntity -Dtests.seed=2F739D88D9C723CA -Dtests.slow=true -Dtests.locale=en-US -Dtests.timezone=US/Aleutian -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
      NOTE: reproduce with: ant test  -Dtestcase=XmlUpdateRequestHandlerTest -Dtests.method=testExternalEntities -Dtests.seed=2F739D88D9C723CA -Dtests.slow=true -Dtests.locale=hr -Dtests.timezone=America/Barbados -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
      NOTE: reproduce with: ant test  -Dtestcase=XmlUpdateRequestHandlerTest -Dtests.method=testNamedEntity -Dtests.seed=2F739D88D9C723CA -Dtests.slow=true -Dtests.locale=hr -Dtests.timezone=America/Barbados -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
      
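The "General entity reference ... not (yet) implemented" failures above can be probed in isolation with a small StAX harness. The following is a minimal sketch, not the code from the attached patches: the class name `EntityExpansionProbe` and its helper methods are hypothetical, and it exercises whichever StAX implementation `XMLInputFactory.newInstance()` resolves on the classpath (the JDK's built-in one if no other is present). It parses a document that declares an internal general entity and collects the resulting character data. A parser that supports entity expansion should yield the replacement text, whereas the traces above show aalto-xml throwing instead.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class EntityExpansionProbe {

    // Hypothetical sample input: an internal general entity declared in the DTD subset.
    static final String SAMPLE =
        "<!DOCTYPE doc [<!ENTITY bar \"baz\">]><doc>a&bar;c</doc>";

    // Hypothetical helper: returns all character data in the document, concatenated.
    static String textOf(XMLInputFactory factory, String xml) {
        StringBuilder text = new StringBuilder();
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.CHARACTERS) {
                        text.append(reader.getText());
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            // Fail loudly so a parser without entity support is easy to spot.
            throw new IllegalStateException("StAX parser rejected the document", e);
        }
        return text.toString();
    }

    // Configures the factory to read the DTD and expand entity references in place.
    static String probe() {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.TRUE);
        factory.setProperty(XMLInputFactory.IS_REPLACING_ENTITY_REFERENCES, Boolean.TRUE);
        return textOf(factory, SAMPLE);
    }

    public static void main(String[] args) {
        System.out.println(probe());
    }
}
```

Running the same class with a different StAX implementation first on the classpath is one way to compare parsers without touching Hadoop or Solr code. Note that enabling DTD support and entity expansion is only appropriate for trusted input.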

      Attachments

        1. HADOOP-14501-branch-2.1.patch
          2 kB
          Jonathan Turner Eagles
        2. HADOOP-14501.5.patch
          6 kB
          Jonathan Turner Eagles
        3. HADOOP-14501.4-branch-2.patch
          6 kB
          Jonathan Turner Eagles
        4. HADOOP-14501.4.patch
          6 kB
          Jonathan Turner Eagles
        5. HADOOP-14501.3-branch-2.patch
          7 kB
          Jonathan Turner Eagles
        6. HADOOP-14501.3.patch
          7 kB
          Jonathan Turner Eagles
        7. HADOOP-14501.2.patch
          5 kB
          Jonathan Turner Eagles
        8. HADOOP-14501.1.patch
          2 kB
          Jonathan Turner Eagles

            People

              Assignee: jeagles Jonathan Turner Eagles
              Reporter: andrew.wang Andrew Wang
              Votes: 0
              Watchers: 5
