Apache Drill / DRILL-8481

Ability to query XML root attributes

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.21.1
    • Fix Version/s: 1.21.2
    • Component/s: Storage - XML
    • Labels: None

    Description

      Hi,

      Attributes of XML elements can currently be retrieved, except for those of the root element.
      It would be useful to be able to retrieve the attributes of the root node of XML files as well.
      In my common use cases, I have many XML files, each containing a single XML record, often with one or more attributes in the root tag.
      To recover these values, I am currently forced to preprocess the files to "copy" the root attributes into the fields of the XML record.

      Even with multiple XML records under the root, it would be useful for the root attributes to be accessible from each record.
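
      For illustration, the preprocessing workaround described above could look roughly like the sketch below. It is not part of Drill: the function name copy_root_attributes and the "<root tag>_<attribute>" naming are hypothetical choices, and the column names Drill would derive from the copied attributes have not been verified here. It simply copies every root attribute onto each direct child of the root, so that each record carries the values.

```python
import xml.etree.ElementTree as ET


def copy_root_attributes(xml_text: str) -> str:
    """Copy every attribute of the root element onto each direct child.

    Hypothetical helper for illustration only; not a Drill API.
    """
    root = ET.fromstring(xml_text)
    for child in root:
        for name, value in root.attrib.items():
            # Assumed naming scheme: prefix with the root tag so the copied
            # attribute cannot collide with one the child already has.
            child.set(f"{root.tag}_{name}", value)
    return ET.tostring(root, encoding="unicode")


# Small document with the same shape as the aaa.xml sample in this report.
sample = (
    '<PPP Version="2023-001" TimeStamp="2023-06-09T21:17:14.416+02:00">'
    '<P1 SubVersion="a1" MID="XX003" PN="156" SL="3"/>'
    '<P2 SubVersion="b1"><Color>blue</Color></P2>'
    '</PPP>'
)

if __name__ == "__main__":
    print(copy_root_attributes(sample))
```

      Running such a step over every file before querying is exactly the extra work this improvement request aims to make unnecessary.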

      Example (file aaa.xml):

      <PPP Version="2023-001" TimeStamp="2023-06-09T21:17:14.416+02:00">
      <P1 SubVersion="a1" MID="XX003" PN="156" SL="3"/>
      <P2 SubVersion="b1"><Color>blue</Color></P2>
      </PPP>
      

      With the query:

      SELECT * FROM(SELECT filename, * FROM TABLE(dfs.test.`/aaa.xml`(type=>'xml', dataLevel=>1)) as xml) AS x;
      

      I can access:

      • P1_SubVersion
      • P1_MID
      • P1_PN
      • P1_SL
      • P2_SubVersion
      • P2.Color

      But I cannot access:

      • PPP_Version
      • PPP_TimeStamp

      Changing dataLevel does not solve the problem.

      Regards,

          People

            Assignee: Charles Givre (cgivre)
            Reporter: benj (benj641)