Parquet / PARQUET-341

Improve write performance with wide schema sparse data

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.9.0, 1.8.2
    • Component/s: None
    • Labels:
      None

      Description

      In the write path, when the data is very sparse, most of the time is spent writing nulls.

      Currently, writing nulls follows the same code path as writing values: when a group is null, the writer recursively traverses all the leaves beneath it.

      When a group is null, every leaf beneath it must be written as null with the same repetition level and definition level, so we can eliminate the recursive call used to reach the leaves.
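
The idea can be illustrated with a minimal sketch. It is a simplified model, not parquet-mr's actual classes: `NullWriteSketch`, `Node`, `writeNullRecursive`, and `writeNullFlat` are hypothetical names, and caching the flattened leaf list per group is an assumption about one way to avoid the repeated traversal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of a schema tree (not parquet-mr code).
class NullWriteSketch {

    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        // For leaves only: each entry is {repetitionLevel, definitionLevel}.
        final List<int[]> nulls = new ArrayList<>();
        // Flattened leaves under this node, computed on first use.
        private List<Node> leaves;

        Node(String name, Node... kids) {
            this.name = name;
            children.addAll(Arrays.asList(kids));
        }

        // In this sketch a node without children is treated as a leaf column.
        boolean isLeaf() {
            return children.isEmpty();
        }

        // Slow path: walks the whole subtree every time the group is null.
        void writeNullRecursive(int r, int d) {
            if (isLeaf()) {
                nulls.add(new int[] {r, d});
                return;
            }
            for (Node child : children) {
                child.writeNullRecursive(r, d);
            }
        }

        // Fast path: collect the leaves once, then loop over the flat list,
        // writing the same (r, d) pair to every leaf.
        void writeNullFlat(int r, int d) {
            if (leaves == null) {
                leaves = new ArrayList<>();
                collectLeaves(this, leaves);
            }
            for (Node leaf : leaves) {
                leaf.nulls.add(new int[] {r, d});
            }
        }

        private static void collectLeaves(Node node, List<Node> out) {
            if (node.isLeaf()) {
                out.add(node);
                return;
            }
            for (Node child : node.children) {
                collectLeaves(child, out);
            }
        }
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        Node group = new Node("group", new Node("inner", a), b);

        group.writeNullRecursive(0, 1); // traverses the tree
        group.writeNullFlat(0, 1);      // iterates the cached leaf list

        System.out.println("nulls written to a: " + a.nulls.size());
        System.out.println("nulls written to b: " + b.nulls.size());
    }
}
```

Both methods write the same levels to the same leaves. The flat version pays for the tree walk once per group instead of once per null record, which is where a wide, sparse schema would be expected to benefit.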

              People

              • Assignee:
                tianshuo Tim
                Reporter:
                tianshuo Tim
              • Votes:
                0
                Watchers:
                2