PHOENIX-4704

Presplit index tables when building asynchronously

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.14.0, 5.0.0
    • Labels:
      None

      Description

      For large data tables with many regions, if we build the index asynchronously using the IndexTool, the index table will initially face a hotspot, as all data region mappers attempt to write to the sole new index region.  This can lead to the index getting disabled if writes to the index table time out during this hotspotting.

      We can add an optional step to the IndexTool (or perhaps activate it automatically based on the number of regions in the data table) that first runs a MapReduce job to gather stats on the indexed column values, and then attempts to presplit the index table before the actual index build MR job runs.
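
As a rough illustration of the presplit step, the sketch below picks split points from a sample of index row keys by taking evenly spaced quantiles. It is not the code in the attached patches; the class name `SplitPointSketch`, the method `chooseSplitPoints`, and the idea of passing in an in-memory sample are all assumptions made for the example. Keys are compared as unsigned bytes in lexicographic order, which is how HBase orders row keys.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Phoenix or the IndexTool.
public class SplitPointSketch {

    // Lexicographic comparison treating each byte as unsigned,
    // matching the ordering HBase applies to row keys.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    // Returns at most numRegions - 1 strictly increasing split points,
    // chosen at evenly spaced positions in the sorted sample.
    static List<byte[]> chooseSplitPoints(List<byte[]> sampledKeys, int numRegions) {
        List<byte[]> sorted = new ArrayList<>(sampledKeys);
        sorted.sort(SplitPointSketch::compareUnsigned);

        List<byte[]> splits = new ArrayList<>();
        if (sorted.isEmpty() || numRegions < 2) {
            return splits;
        }
        for (int i = 1; i < numRegions; i++) {
            int idx = (int) ((long) i * sorted.size() / numRegions);
            byte[] candidate = sorted.get(idx);
            // Skip repeated values so no two split points are equal.
            if (splits.isEmpty()
                    || compareUnsigned(splits.get(splits.size() - 1), candidate) < 0) {
                splits.add(candidate);
            }
        }
        return splits;
    }
}
```

The resulting keys could then be supplied as split keys when the index table is created, so that the mappers of the build job spread their writes across several regions from the start. If the sampled values are heavily skewed toward a few keys, this sketch returns fewer split points than requested.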

        Attachments

        1. PHOENIX-4704.master.v1.patch
          20 kB
          Vincent Poon
        2. PHOENIX-4704.master.v2.patch
          21 kB
          Vincent Poon

              People

              • Assignee:
                vincentpoon Vincent Poon
                Reporter:
                vincentpoon Vincent Poon
              • Votes:
                0
                Watchers:
                4