IMPALA-10466

Handle deprecated TWO_LEVEL Parquet arrays more gracefully

Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved

    Description

      IMPALA-4725 changed the default of PARQUET_ARRAY_RESOLUTION from TWO_LEVEL_THEN_THREE_LEVEL to THREE_LEVEL. This fixed the incorrect detection of some ambiguous cases, but old TWO_LEVEL Parquet lists are no longer read correctly by default: their values are silently replaced with NULL, without any error message.

      I would prefer a solution that:
      a. detects the correct resolution when possible
      b. returns a clear warning/error when resolution is not possible or ambiguous, and points the user toward the query option that needs to be set manually
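
      A minimal sketch of what (a) and (b) could look like, written in Python for brevity. It is not Impala's implementation: the `Node` class and the `resolve_list_encoding` and `describe` helpers are hypothetical, and the rules are loosely based on the backward-compatibility rules for LIST in the Parquet format specification. Treating every single-field repeated group that is not named `list`, `array`, or `<list name>_tuple` as ambiguous is an assumption made here to illustrate (b).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical, simplified Parquet schema node."""
    name: str
    repetition: str  # "required", "optional", or "repeated"
    children: List["Node"] = field(default_factory=list)  # empty for primitives


def resolve_list_encoding(list_node: Node) -> str:
    """Return "TWO_LEVEL", "THREE_LEVEL", or "AMBIGUOUS" for a LIST-annotated group."""
    if len(list_node.children) != 1 or list_node.children[0].repetition != "repeated":
        raise ValueError("a LIST group must contain exactly one repeated field")
    rep = list_node.children[0]
    if not rep.children:
        # Repeated primitive: the repeated field itself is the element.
        return "TWO_LEVEL"
    if len(rep.children) > 1:
        # Repeated group with several fields: the group is a struct element.
        return "TWO_LEVEL"
    if rep.name == "array" or rep.name == list_node.name + "_tuple":
        # Names used by legacy writers for a single-field struct element.
        return "TWO_LEVEL"
    if rep.name == "list" and rep.children[0].name == "element":
        # Standard three-level layout.
        return "THREE_LEVEL"
    # Single-field repeated group with a non-standard name: cannot tell whether
    # the group is the element or only a wrapper around it (assumption of this sketch).
    return "AMBIGUOUS"


def describe(list_node: Node) -> str:
    """Build a user-facing message; point at the query option when detection fails."""
    encoding = resolve_list_encoding(list_node)
    if encoding == "AMBIGUOUS":
        return (
            f"Cannot determine the array encoding of column '{list_node.name}'; "
            "set the PARQUET_ARRAY_RESOLUTION query option manually."
        )
    return f"Column '{list_node.name}' uses {encoding} array encoding."
```

      With this sketch, a schema such as `optional group f (LIST) { repeated int32 array; }` would resolve to TWO_LEVEL, while a repeated group named `bag` with a single child would produce the warning that names PARQUET_ARRAY_RESOLUTION instead of silently returning NULLs.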

            People

              Assignee: Csaba Ringhofer (csringhofer)
              Reporter: Csaba Ringhofer (csringhofer)
              Votes: 0
              Watchers: 3