  1. Apache Drill
  2. DRILL-3270

Modify HbaseGroupScan to allow multiple fragments per region

    Details

      Description

      When performing a full HBase or MapR-DB table scan with Drill, the resulting query profile shows that only one minor fragment is assigned per region, regardless of the region's size. With extremely large regions, and especially when region sizes are uneven, this results in a low degree of parallelism and poor performance.

      One possible option (mentioned by sphillips) is to compute the splits lazily by assuming that the keys within a given region are evenly distributed (not perfect, but better than nothing), perhaps with a 'max-frags-per-region' setting to cap the number of fragments per region.
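
      To make the suggestion concrete, here is a rough sketch of how a region's key range could be divided under the even-distribution assumption. It is not taken from HbaseGroupScan. The class name `RegionSplitSketch`, the methods `fragmentCount` and `splitEvenly`, and the `targetBytesPerFragment` parameter are all hypothetical, and `maxFragsPerRegion` only stands in for the proposed 'max-frags-per-region' setting. The sketch assumes both region boundary keys are non-empty; the open-ended first and last regions would need separate handling.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RegionSplitSketch {

    // Hypothetical policy: one fragment per targetBytesPerFragment of region data,
    // at least 1, and never more than the proposed max-frags-per-region cap.
    static int fragmentCount(long regionSizeBytes, long targetBytesPerFragment, int maxFragsPerRegion) {
        long wanted = (regionSizeBytes + targetBytesPerFragment - 1) / targetBytesPerFragment;
        return (int) Math.max(1, Math.min(wanted, maxFragsPerRegion));
    }

    // Interprets a key as an unsigned integer, right-padded with zero bytes to a fixed width.
    static BigInteger toInt(byte[] key, int width) {
        return new BigInteger(1, Arrays.copyOf(key, width));
    }

    // Converts an unsigned integer back to a key of exactly `width` bytes.
    static byte[] toKey(BigInteger value, int width) {
        byte[] raw = value.toByteArray(); // may carry a leading sign byte or be shorter than width
        byte[] out = new byte[width];
        int copy = Math.min(raw.length, width);
        System.arraycopy(raw, raw.length - copy, out, width - copy, copy);
        return out;
    }

    // Returns boundary keys that cut [start, end) into at most `fragments` sub-ranges,
    // assuming keys are evenly distributed between start and end.
    static List<byte[]> splitEvenly(byte[] start, byte[] end, int fragments) {
        if (fragments < 1) {
            throw new IllegalArgumentException("fragments must be >= 1");
        }
        int width = Math.max(start.length, end.length);
        BigInteger lo = toInt(start, width);
        BigInteger hi = toInt(end, width);
        if (lo.compareTo(hi) >= 0) {
            throw new IllegalArgumentException("start key must sort before end key");
        }
        BigInteger span = hi.subtract(lo);
        BigInteger n = BigInteger.valueOf(fragments);
        List<byte[]> bounds = new ArrayList<>();
        BigInteger previous = null;
        for (int i = 0; i <= fragments; i++) {
            BigInteger point = lo.add(span.multiply(BigInteger.valueOf(i)).divide(n));
            // Skip duplicate boundaries when the key range is narrower than the fragment count.
            if (previous == null || !point.equals(previous)) {
                bounds.add(toKey(point, width));
                previous = point;
            }
        }
        return bounds;
    }

    public static void main(String[] args) {
        int frags = fragmentCount(10L << 30, 1L << 30, 4); // 10 GiB region, capped at 4 fragments
        for (byte[] key : splitEvenly(new byte[] {0x00}, new byte[] {0x40}, frags)) {
            System.out.println(Arrays.toString(key));
        }
    }
}
```

      Each adjacent pair of boundary keys would then back one minor fragment's scan range. If real keys are skewed within a region, the resulting sub-ranges will be uneven, which is the "not perfect" caveat noted above.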


            People

            • Assignee:
              Unassigned
              Reporter:
              andypern Andy Pernsteiner
