Axiom / AXIOM-478

Problems parsing large XML

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.2.17
    • Fix Version/s: 1.2.18
    • Component/s: LLOM
    • Labels: None

    Description

      This is LU Jie from IBM. We use Axiom to parse Atom documents in our project.
      One of our CMIS APIs attaches file content to the XML, so a large file produces a large Atom entry.
      If we use Entry.getExtension(QName) to parse the content, it allocates a large amount of memory (around 5-6 times the file size).
      We need your help to clarify whether we can use Axiom's DOM-like API to get the text of a certain element as a stream, that is, without allocating a large object in memory.
      Or is there an alternative solution for this use case?
      We do know that we can use a pull parser to parse the XML as a stream, but we would like to know whether Axiom already provides an API or solution so that we can avoid writing a parser ourselves.

      Here's the sample XML. We need to parse the text of the cmisra:base64 element:

      <atom:entry
          xmlns:atom="http://www.w3.org/2005/Atom"
          xmlns:cmisra="http://docs.oasis-open.org/ns/cmis/restatom/200908/"
          xmlns:chemistry="http://chemistry.apache.org/"
          xmlns:cmis="http://docs.oasis-open.org/ns/cmis/core/200908/">
          <atom:id
              xmlns:atom="http://www.w3.org/2005/Atom">urn:uuid:00000000-0000-0000-0000-00000000000
          </atom:id>
          <atom:title
              xmlns:atom="http://www.w3.org/2005/Atom" type="text">doucment1446016556658.txt
          </atom:title>
          <atom:updated
              xmlns:atom="http://www.w3.org/2005/Atom">2015-10-28T07:15:57.594Z
          </atom:updated>
          <cmisra:content
              xmlns:cmisra="http://docs.oasis-open.org/ns/cmis/restatom/200908/">
              <cmisra:mediatype
                  xmlns:cmisra="http://docs.oasis-open.org/ns/cmis/restatom/200908/">text/plain
              </cmisra:mediatype>
              <chemistry:filename
                  xmlns:chemistry="http://chemistry.apache.org/">doucment1446016556658.txt
              </chemistry:filename>
              <cmisra:base64
                  xmlns:cmisra="http://docs.oasis-open.org/ns/cmis/restatom/200908/">Base64 encoded content of large file
              </cmisra:base64>
          </cmisra:content>
          <cmisra:object
              xmlns:cmisra="http://docs.oasis-open.org/ns/cmis/restatom/200908/">
              <cmis:properties
                  xmlns:cmis="http://docs.oasis-open.org/ns/cmis/core/200908/">
                  <cmis:propertyId
                      xmlns:cmis="http://docs.oasis-open.org/ns/cmis/core/200908/" propertyDefinitionId="cmis:objectTypeId">
                      <cmis:value
                          xmlns:cmis="http://docs.oasis-open.org/ns/cmis/core/200908/">snx:file
                      </cmis:value>
                  </cmis:propertyId>
                  <cmis:propertyString
                      xmlns:cmis="http://docs.oasis-open.org/ns/cmis/core/200908/" propertyDefinitionId="cmis:name">
                      <cmis:value
                          xmlns:cmis="http://docs.oasis-open.org/ns/cmis/core/200908/">doucment1446016556658.txt
                      </cmis:value>
                  </cmis:propertyString>
              </cmis:properties>
          </cmisra:object>
      </atom:entry>
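
For reference, the pull-parser fallback mentioned in the description could look roughly like the sketch below. It uses only the JDK's StAX API (javax.xml.stream) and java.util.Base64, not any Axiom API, so it does not answer whether Axiom offers a built-in equivalent. The class name Base64ElementStreamer, the method copyDecoded, and the 4096-character buffer size are hypothetical choices made for illustration. The sketch assumes the cmisra:base64 element contains only text, and it relies on the parser reporting long text in several chunks, which is implementation-dependent.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.Reader;
import java.io.StringReader;
import java.util.Arrays;
import java.util.Base64;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: streams the Base64 text of one element to an OutputStream.
final class Base64ElementStreamer {
    static final QName CMISRA_BASE64 =
            new QName("http://docs.oasis-open.org/ns/cmis/restatom/200908/", "base64");

    /** Decodes the first element named target into out; returns bytes written, or -1 if absent. */
    static long copyDecoded(Reader xml, QName target, OutputStream out)
            throws XMLStreamException, IOException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Non-coalescing mode lets the parser hand over long text piece by piece.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.FALSE);
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        XMLStreamReader reader = factory.createXMLStreamReader(xml);
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && target.equals(reader.getName())) {
                    return decodeText(reader, out);
                }
            }
            return -1;
        } finally {
            reader.close();
        }
    }

    // Assumes text-only content: stops at the first END_ELEMENT after the start tag.
    private static long decodeText(XMLStreamReader reader, OutputStream out)
            throws XMLStreamException, IOException {
        Base64.Decoder decoder = Base64.getDecoder();
        byte[] pending = new byte[4096]; // multiple of 4, so full buffers decode cleanly
        int used = 0;
        long written = 0;
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.END_ELEMENT) {
                break;
            }
            if (event != XMLStreamConstants.CHARACTERS
                    && event != XMLStreamConstants.CDATA
                    && event != XMLStreamConstants.SPACE) {
                continue; // skip comments and processing instructions
            }
            char[] chars = reader.getTextCharacters();
            int start = reader.getTextStart();
            int end = start + reader.getTextLength();
            for (int i = start; i < end; i++) {
                char c = chars[i];
                if (Character.isWhitespace(c)) {
                    continue; // indentation and line breaks are not part of the payload
                }
                if (c > 127) {
                    throw new IOException("Non-ASCII character in Base64 content");
                }
                pending[used++] = (byte) c;
                if (used == pending.length) {
                    byte[] decoded = decoder.decode(pending);
                    out.write(decoded);
                    written += decoded.length;
                    used = 0;
                }
            }
        }
        if (used > 0) { // decode the final, possibly padded, group
            byte[] decoded = decoder.decode(Arrays.copyOf(pending, used));
            out.write(decoded);
            written += decoded.length;
        }
        return written;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<cmisra:base64 xmlns:cmisra="
                + "\"http://docs.oasis-open.org/ns/cmis/restatom/200908/\">aGVsbG8=\n"
                + "</cmisra:base64>";
        copyDecoded(new StringReader(xml), CMISRA_BASE64, System.out);
        System.out.flush();
    }
}
```

In a real application the Reader would wrap the incoming request stream and the OutputStream would be a file or repository stream, so that at most one parser chunk and one 4096-character buffer are held in memory at a time.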
      

          People

            Assignee: Andreas Veithen (veithen)
            Reporter: LU Jie (lujie)