SPARK-12992

Vectorize parquet decoding using ColumnarBatch

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.0
    • Component/s: SQL
    • Labels:
      None

      Description

      Parquet files benefit from vectorized decoding, and ColumnarBatch was designed to support it: each encoded Parquet column is decoded into a single ColumnVector.
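
To make the "one encoded column, one vector" idea concrete, here is a minimal sketch in Python. It is not Spark's implementation: `IntColumnVector`, `decode_rle_column`, and `decode_batch` are hypothetical names, and the simplified (value, run length) encoding is assumed purely for illustration. The point is only that each column is decoded in bulk into its own flat vector rather than being assembled row by row.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Run = Tuple[int, int]  # (value, run_length); a simplified encoding assumed for this sketch


@dataclass
class IntColumnVector:
    """Hypothetical stand-in for a column vector: flat storage for one column."""
    capacity: int
    values: List[int] = field(default_factory=list)

    def free_slots(self) -> int:
        return self.capacity - len(self.values)

    def put_repeated(self, value: int, count: int) -> None:
        # Write a whole run in one call instead of one value at a time.
        self.values.extend([value] * count)


def decode_rle_column(runs: List[Run], batch_size: int) -> IntColumnVector:
    """Decode one encoded column into one vector holding at most batch_size values."""
    vector = IntColumnVector(capacity=batch_size)
    for value, run_length in runs:
        if vector.free_slots() == 0:
            break
        vector.put_repeated(value, min(run_length, vector.free_slots()))
    return vector


def decode_batch(encoded: Dict[str, List[Run]], batch_size: int) -> Dict[str, IntColumnVector]:
    """Decode every column independently: one encoded column -> one vector."""
    return {name: decode_rle_column(runs, batch_size) for name, runs in encoded.items()}


encoded = {"id": [(7, 3), (9, 2)], "flag": [(1, 5)]}
batch = decode_batch(encoded, batch_size=4)
print(batch["id"].values)    # [7, 7, 7, 9]
print(batch["flag"].values)  # [1, 1, 1, 1]
```

Decoding per column lets a whole run be written with a single call, which is where a vectorized reader would be expected to save work compared with materializing one row at a time.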

          Activity

          apachespark Apache Spark added a comment -

          User 'nongli' has created a pull request for this issue:
          https://github.com/apache/spark/pull/10908

          davies Davies Liu added a comment -

          Issue resolved by pull request 10908
          https://github.com/apache/spark/pull/10908

          nongli Nong Li added a comment -

          There's more work to do here.

          apachespark Apache Spark added a comment -

          User 'nongli' has created a pull request for this issue:
          https://github.com/apache/spark/pull/11055

          davies Davies Liu added a comment -

          Nong Li, we usually have one PR per JIRA; the merge tools close the JIRA as resolved automatically.

          davies Davies Liu added a comment -

          Issue resolved by pull request 11055
          https://github.com/apache/spark/pull/11055


            People

            • Assignee: nongli Nong Li
            • Reporter: nongli Nong Li
            • Votes: 0
            • Watchers: 8
