  1. Parquet
  2. PARQUET-324

row count incorrect if data file has more than 2^31 rows


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.7.0, 1.8.0
    • Fix Version/s: 1.8.0
    • Component/s: parquet-mr
    • Labels:
      None

      Description

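      If a Parquet file has more than 2^31 - 1 rows (the largest value a Java int can hold), the row count written into the file metadata is incorrect.
      The cause is that numRows in ParquetMetadataConverter.toParquetMetadata is declared as int instead of long. The compound assignment += narrows the long returned by getRowCount() back to int without a compiler error, so the running total overflows silently: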
      int numRows = 0;
      for (BlockMetaData block : blocks) {
        numRows += block.getRowCount();
        addRowGroup(parquetMetadata, rowGroups, block);
      }
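      The overflow can be reproduced outside parquet-mr with a minimal sketch. The class and method names below (RowCountOverflow, sumWithInt, sumWithLong) are hypothetical and are not part of the Parquet code base; the long accumulator is the change the description implies, not necessarily the exact patch that was committed.

```java
public class RowCountOverflow {

    // Mirrors the reported pattern: an int accumulator fed with long row counts.
    // "numRows += count" compiles because compound assignment implicitly casts
    // the result back to int, so the total wraps once it exceeds Integer.MAX_VALUE.
    static long sumWithInt(long[] rowCounts) {
        int numRows = 0;
        for (long count : rowCounts) {
            numRows += count;
        }
        return numRows;
    }

    // Same loop with a long accumulator: no narrowing, no wrap-around.
    static long sumWithLong(long[] rowCounts) {
        long numRows = 0;
        for (long count : rowCounts) {
            numRows += count;
        }
        return numRows;
    }

    public static void main(String[] args) {
        // Two hypothetical row groups of 1.5 billion rows each (3 billion total).
        long[] rowCounts = {1_500_000_000L, 1_500_000_000L};
        System.out.println(sumWithInt(rowCounts));  // prints -1294967296
        System.out.println(sumWithLong(rowCounts)); // prints 3000000000
    }
}
```

      Each individual row group here fits in an int; only the sum does not, which is why the problem shows up in the file-level total rather than in the per-block counts.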

              People

              • Assignee:
                tfriedr Thomas Friedrich
                Reporter:
                tfriedr Thomas Friedrich
              • Votes:
                0
                Watchers:
                2
