IMPALA-2515 (sub-task of IMPALA-5628: Parquet support for additional valid decimal representations)

Impala rejects Parquet schemas where decimal fixed_len_byte_array columns have unnecessary padding bytes


    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: Impala 2.3.0
    • Fix Version/s: Impala 4.0
    • Component/s: Backend

      Description

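      Impala cannot read a decimal column whose fixed size is larger than the minimum needed for its precision, as in this schema: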

      {"name": "tmp_1",
       "type": "fixed",
       "size": 8,
       "logicalType": "decimal",
       "precision": 10,
       "scale": 5}
      

      However, the same column declared with size 5 can be read:

      {"name": "tmp_1",
       "type": "fixed",
       "size": 5,
       "logicalType": "decimal",
       "precision": 10,
       "scale": 5}
      

      The size must be exactly the minimum number of bytes required for the precision, as given by the following formula; otherwise Impala is unable to read the decimal column:

      size = int(math.ceil((math.log(2, 10) + precision) / math.log(256, 10)))
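
As a cross-check of the formula above, here is a minimal sketch that computes the same minimum size with integer arithmetic, avoiding floating-point logarithms. The function names `min_fixed_len_bytes` and `min_fixed_len_bytes_float` are hypothetical and are not part of Impala or Parquet.

```python
import math


def min_fixed_len_bytes(precision: int) -> int:
    """Smallest byte width whose signed two's-complement range covers
    every unscaled value with `precision` decimal digits."""
    if precision < 1:
        raise ValueError("precision must be a positive integer")
    max_unscaled = 10 ** precision - 1
    # Bits for the magnitude, plus one sign bit, rounded up to whole bytes.
    return (max_unscaled.bit_length() + 1 + 7) // 8


def min_fixed_len_bytes_float(precision: int) -> int:
    """The formula from the description, kept for comparison."""
    return int(math.ceil((math.log(2, 10) + precision) / math.log(256, 10)))


if __name__ == "__main__":
    for p in (1, 9, 10, 18, 38):
        print(p, min_fixed_len_bytes(p), min_fixed_len_bytes_float(p))
```

For precision 10 both functions give 5, which matches the readable schema; the size 8 in the rejected schema therefore carries 3 padding bytes.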
      

      Nothing in the Parquet spec requires decimal columns to use the minimum size. Arguably a writer that emits padding bytes has a bug, because it just wastes space, but the schema is still valid.
      https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#decimal
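
To illustrate why padding does not change the value, here is a rough sketch that relies on the spec's rule that the unscaled value is stored as big-endian two's complement. The helpers `encode_unscaled` and `decode_fixed_decimal` are hypothetical names for illustration only and do not reflect Impala's actual decoder.

```python
from decimal import Decimal


def encode_unscaled(unscaled: int, size: int) -> bytes:
    """Encode an unscaled decimal value as big-endian two's complement
    in exactly `size` bytes (sign-extended if size exceeds the minimum)."""
    return unscaled.to_bytes(size, byteorder="big", signed=True)


def decode_fixed_decimal(raw: bytes, scale: int) -> Decimal:
    """Decode a fixed-length decimal of any width into an exact Decimal."""
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    sign = 1 if unscaled < 0 else 0
    digits = tuple(int(d) for d in str(abs(unscaled)))
    # Building from a (sign, digits, exponent) tuple avoids context rounding.
    return Decimal((sign, digits, -scale))


if __name__ == "__main__":
    unscaled = -1234567890            # -12345.67890 at precision 10, scale 5
    tight = encode_unscaled(unscaled, 5)   # minimum size
    padded = encode_unscaled(unscaled, 8)  # three sign-extension bytes
    print(tight.hex(), padded.hex())
    print(decode_fixed_decimal(tight, 5), decode_fixed_decimal(padded, 5))
```

The 8-byte encoding is the 5-byte encoding preceded by sign-extension bytes, so a reader that accepts any sufficiently large width decodes both to the same decimal.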

        Attachments

        1. image-2020-02-07-11-48-35-179.png
          35 kB
          Onur Tokat
        2. image-2020-02-07-11-36-31-458.png
          21 kB
          Onur Tokat
        3. image-2020-02-07-11-34-04-220.png
          33 kB
          Onur Tokat
        4. image-2020-02-07-11-33-27-944.png
          21 kB
          Onur Tokat
        5. image-2020-02-07-11-31-43-641.png
          42 kB
          Onur Tokat
        6. image-2020-02-07-11-31-38-074.png
          42 kB
          Onur Tokat

              People

              • Assignee: Tim Armstrong (tarmstrong)
              • Reporter: Taras Bobrovytsky (tarasbob)
              • Votes: 1
              • Watchers: 6
