IMPALA-10332

Add file formats to HdfsScanNode's thrift representation and codegen for those


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Implemented
    • Affects Version/s: None
    • Fix Version/s: Impala 4.0.0
    • Component/s: Backend, Frontend
    • Labels: None

    Description

      List all file formats that an HdfsScanNode needs to process in any fragment instance. Some file formats may not be needed in every fragment instance.

      This is a step towards sharing codegen between different Impala backends. Using the file formats listed in the Thrift message, a backend can generate code for file formats that its own process does not need but that fragment instances running on other backends do, so the resulting binary can be shared between multiple backends.

      Codegen for file formats is driven by the Thrift message, not by what the current backend actually needs. This causes some extra work when a file format is not needed on the current backend and codegen sharing is unavailable (it is not implemented at this point). However, the overall number of such cases is low.
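
      The trade-off described above can be illustrated with a minimal Python sketch. It is not Impala code: the `FileFormat` enum, the function names, and the instance labels are all hypothetical and only model the idea that the scan node carries the union of formats across fragment instances, and that each backend generates code for that whole list rather than only for the formats it scans locally.

```python
from enum import Enum


class FileFormat(Enum):
    # Illustrative subset only; not Impala's actual Thrift enum.
    TEXT = "TEXT"
    PARQUET = "PARQUET"
    ORC = "ORC"
    AVRO = "AVRO"


def formats_for_scan_node(formats_by_instance):
    """Union of the formats needed by any fragment instance of the scan node."""
    all_formats = set()
    for formats in formats_by_instance.values():
        all_formats |= set(formats)
    return all_formats


def formats_to_codegen(node_formats, local_formats, use_plan_list=True):
    """Formats a backend generates code for.

    With use_plan_list=True the decision follows the plan-wide list, as the
    issue describes; otherwise only locally scanned formats are compiled.
    """
    return set(node_formats) if use_plan_list else set(local_formats)


# Hypothetical plan: two fragment instances scanning different formats.
formats_by_instance = {
    "instance-0": [FileFormat.PARQUET],
    "instance-1": [FileFormat.PARQUET, FileFormat.TEXT],
}

node_formats = formats_for_scan_node(formats_by_instance)
local = formats_by_instance["instance-0"]

codegen_set = formats_to_codegen(node_formats, local)
# Formats compiled on this backend that it never scans itself.
extra_work = codegen_set - set(local)

print(sorted(f.name for f in codegen_set))
print(sorted(f.name for f in extra_work))
```

      In this sketch the backend running `instance-0` also compiles the TEXT path even though it only scans PARQUET. That is the extra work the description mentions, accepted so that every backend produces the same code and the result could later be shared.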

      The file formats are also added to the node's explain string.

            People

              Assignee: Daniel Becker (daniel.becker)
              Reporter: Daniel Becker (daniel.becker)
              Votes: 0
              Watchers: 2
