  1. Spark
  2. SPARK-35743 Improve Parquet vectorized reader
  3. SPARK-38891

Skipping allocating vector for repetition & definition levels when possible

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.3.0
    • Fix Version/s: 3.3.0, 3.4.0
    • Component/s: SQL
    • Labels: None

    Description

      Currently, the vectorized Parquet reader allocates vectors for repetition and definition levels in all cases. However, in certain cases (e.g., when reading primitive types), these vectors are unnecessary and the allocation should be skipped.
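
      The idea can be illustrated with a minimal sketch. This is not Spark's actual code: the class `LevelBuffers`, its `allocate` method, and the `insideComplexType` flag are hypothetical names chosen for illustration, and the exact condition under which the real reader skips the allocation is an assumption here. The sketch only shows the general pattern of allocating the level buffers when a column is nested and leaving them unallocated otherwise.

```java
/**
 * Hypothetical sketch (not Spark's real classes): holds the per-batch
 * repetition and definition level buffers for one Parquet column.
 */
final class LevelBuffers {
    // Both arrays are null when the allocation was skipped.
    final int[] repetitionLevels;
    final int[] definitionLevels;

    private LevelBuffers(int[] repetitionLevels, int[] definitionLevels) {
        this.repetitionLevels = repetitionLevels;
        this.definitionLevels = definitionLevels;
    }

    /**
     * Allocates level buffers only when they are needed.
     *
     * @param capacity          number of values per batch
     * @param insideComplexType assumed flag: true if the column is nested in
     *                          a struct, array, or map and its levels are
     *                          needed to reassemble the parent
     */
    static LevelBuffers allocate(int capacity, boolean insideComplexType) {
        if (!insideComplexType) {
            // Plain primitive column: skip both allocations.
            return new LevelBuffers(null, null);
        }
        return new LevelBuffers(new int[capacity], new int[capacity]);
    }

    /** Returns true if the level buffers were actually allocated. */
    boolean hasLevels() {
        return repetitionLevels != null && definitionLevels != null;
    }
}
```

      With this shape, a caller would check `hasLevels()` before touching either buffer, so a batch of a top-level primitive column carries no per-value level storage at all.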

          People

            Assignee: Chao Sun (csun)
            Reporter: Chao Sun (csun)
