Parquet / PARQUET-2175

Skip method skips levels and not rows for repeated fields

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: parquet-cpp
    • Labels: None

    Description

      The TypedColumnReader::Skip method, with the signature:

      virtual int64_t Skip(int64_t num_levels_to_skip) = 0;

      skips levels for both repeated and non-repeated fields. For a non-repeated field, each row has exactly one level, so skipping levels and skipping rows are equivalent. For a repeated field, a single row can span several levels, so skipping levels is not very useful; we want to be able to skip rows instead.

      For example, for the following rows:

      message M { repeated int32 b = 1 }

      rows: {}, {[10,10]}, {[20, 20, 20]}

      values = {10, 10, 20, 20, 20};
      def_levels = {0, 1, 1, 1, 1, 1};
      rep_levels = {0, 0, 1, 0, 1, 1};

      We want Skip(2) to skip the first two rows, so that the next value read is 20. Instead, it skips the first two levels (the empty row and the first 10), so the next value read is 10.

          People

            Assignee: Unassigned
            Reporter: panahi fatemah
            Votes: 0
            Watchers: 3

              Time Tracking

                Original Estimate: 72h
                Remaining Estimate: 72h
                Time Spent: Not Specified