HIVE-8853 (Hive; sub-task of HIVE-9134, Uber JIRA to track HOS performance work)

Make vectorization work with Spark [Spark Branch]


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.1.0
    • Component/s: Spark
    • Labels: None

    Description

      In Hive, for vectorization to work, the reader must also be vectorized: it must be able to read a chunk of rows (or a list of column chunks) instead of one row at a time. However, we use Spark RDD for reading, which in turn uses the underlying InputFormat to read. Subsequent processing also needs to happen in batches. We need to make sure that vectorization works as expected.
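The difference between row-at-a-time and batched reading can be illustrated with a minimal, self-contained sketch. It does not use Hive or Spark classes: `LongColumnBatch`, `nextBatch`, and the batch size of 1024 are hypothetical names and values chosen for illustration only, and the iterator stands in for whatever record reader the InputFormat provides.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class BatchReadSketch {
    /** Hypothetical single-column batch: a primitive array plus the number of valid rows. */
    static final class LongColumnBatch {
        final long[] values;
        int size;

        LongColumnBatch(int capacity) {
            values = new long[capacity];
        }
    }

    /** Refills the batch from the row source; returns false once no rows remain. */
    static boolean nextBatch(Iterator<Long> rows, LongColumnBatch batch) {
        batch.size = 0;
        while (batch.size < batch.values.length && rows.hasNext()) {
            batch.values[batch.size++] = rows.next();
        }
        return batch.size > 0;
    }

    /** Row-at-a-time processing: one operator call per row. */
    static long sumRowAtATime(Iterator<Long> rows) {
        long total = 0;
        while (rows.hasNext()) {
            total += rows.next();
        }
        return total;
    }

    /** Batched processing: the operator runs a tight loop over each column chunk. */
    static long sumBatched(Iterator<Long> rows, int batchSize) {
        LongColumnBatch batch = new LongColumnBatch(batchSize);
        long total = 0;
        while (nextBatch(rows, batch)) {
            for (int i = 0; i < batch.size; i++) {
                total += batch.values[i];
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Long> data = new ArrayList<>();
        for (long i = 1; i <= 2500; i++) {
            data.add(i);
        }
        // Both paths must produce the same result; only the access pattern differs.
        System.out.println(sumRowAtATime(data.iterator()));
        System.out.println(sumBatched(data.iterator(), 1024));
    }
}
```

The point of the sketch is that the reader and every downstream operator must agree on the batch as the unit of work; if the reader hands over single rows, the batched operators never get the chunked input they expect.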

            People

              Assignee: jxiang Jimmy Xiang
              Reporter: xuefuz Xuefu Zhang
              Votes: 0
              Watchers: 6

              Dates

                Created:
                Updated:
                Resolved: