Apache Drill / DRILL-3856

Enhance Scan costing to include factors other than row count


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • 1.1.0
    • None
    • None

    Description

      The computeSelfCost() methods of DrillScanRel and ScanPrel currently compute the CPU cost of a Scan as a function of row count and column count only. This works as long as there is only a single type of Scan plan.

      With the addition of the native reader for Hive Parquet tables, there are now two ways to perform the same scan: a HiveScan and a Drill native scan. Both scans produce the same row count, so row count alone cannot differentiate them and the cost model needs another way to do so. The CPU and memory cost of the Drill native scan is expected to be lower than that of the HiveScan, so these factors need to be included in the costing.
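      The idea can be illustrated with a minimal, self-contained sketch. It is not Drill or Calcite code: ScanCostSketch, ReaderType, Cost, the avgColumnBytes parameter, and the 0.5 multipliers are all hypothetical names and placeholder values chosen for illustration. The only things taken from the description above are that the base CPU cost depends on row count and column count, and that the native scan is expected to be cheaper in CPU and memory.

```java
/** Hypothetical sketch only; none of these names are Drill or Calcite APIs. */
final class ScanCostSketch {

    /** Per-reader multipliers. The values are placeholders, not measurements. */
    enum ReaderType {
        HIVE_SCAN(1.0, 1.0),
        DRILL_NATIVE_SCAN(0.5, 0.5); // assumed cheaper, per the expectation in this issue

        final double cpuFactor;
        final double memoryFactor;

        ReaderType(double cpuFactor, double memoryFactor) {
            this.cpuFactor = cpuFactor;
            this.memoryFactor = memoryFactor;
        }
    }

    /** Simple cost triple: row count, CPU cost, memory cost. */
    static final class Cost {
        final double rows;
        final double cpu;
        final double memory;

        Cost(double rows, double cpu, double memory) {
            this.rows = rows;
            this.cpu = cpu;
            this.memory = memory;
        }

        /** Compares CPU first and uses memory as the tie-breaker; row count is ignored. */
        boolean isCheaperThan(Cost other) {
            if (cpu != other.cpu) {
                return cpu < other.cpu;
            }
            return memory < other.memory;
        }
    }

    /** Base cost from row and column count, scaled by reader-specific factors. */
    static Cost computeSelfCost(ReaderType reader, double rowCount,
                                int columnCount, double avgColumnBytes) {
        double baseCpu = rowCount * columnCount;
        double cpu = baseCpu * reader.cpuFactor;
        double memory = rowCount * columnCount * avgColumnBytes * reader.memoryFactor;
        return new Cost(rowCount, cpu, memory);
    }

    public static void main(String[] args) {
        Cost hive = computeSelfCost(ReaderType.HIVE_SCAN, 1_000_000, 10, 8);
        Cost nativeScan = computeSelfCost(ReaderType.DRILL_NATIVE_SCAN, 1_000_000, 10, 8);
        System.out.println("same row count: " + (hive.rows == nativeScan.rows));
        System.out.println("native scan cheaper: " + nativeScan.isCheaperThan(hive));
    }
}
```

      With factors like these, two scans with identical row counts no longer tie on cost, which is the differentiation the issue asks for. How the real factors would be chosen or exposed is left open here.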

            People

              Assignee: Unassigned
              Reporter: Aman Sinha (amansinha100)
              Votes: 0
              Watchers: 1
