Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Balancer
    • Labels: None

    Description

      In production, we have seen a few critical large tables that handle the majority of the load, so table skew is becoming more important. With the updated table skew cost function, the balancer finally works for distributing large tables on a large cluster. We should increase the table skew weight from 35 to a level comparable to region count skew: 500. We could go further and replace region count skew with table skew, since the latter works the same way and accounts for region distribution per node.
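To make the effect of the proposed change concrete, here is a minimal sketch of how much of the total weight each cost function holds before and after raising table skew from 35 to 500. It assumes a simplified model in which the balancer's total cost is a weighted sum of per-function costs, and it includes only the three cost functions named in this issue (the real balancer has more). The class and method names are hypothetical and are not HBase APIs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BalancerWeightShare {
    /** Fraction of the summed weights held by the named cost function. */
    public static double share(Map<String, Double> weights, String name) {
        double total = 0.0;
        for (double w : weights.values()) {
            total += w;
        }
        return weights.get(name) / total;
    }

    public static void main(String[] args) {
        // Weights quoted in this issue; other cost functions are omitted (assumption).
        Map<String, Double> weights = new LinkedHashMap<>();
        weights.put("regionCountSkew", 500.0);
        weights.put("tableSkew", 35.0);
        weights.put("storeFileSize", 5.0);

        System.out.printf("tableSkew share at weight 35:  %.3f%n",
                share(weights, "tableSkew"));

        // Proposed change: make table skew comparable to region count skew.
        weights.put("tableSkew", 500.0);
        System.out.printf("tableSkew share at weight 500: %.3f%n",
                share(weights, "tableSkew"));
    }
}
```

Under this simplified model, table skew goes from a small fraction of the total weight to roughly the same influence as region count skew, which is the intent of the proposal.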

      Another weight we found helpful to increase is the one for the store file size cost function. Ideally, if the normalizer worked perfectly, we would not need to worry about it, since region count skew would already account for it. In practice it often does not, so store file distribution needs to be given more weight to compensate. We tested changing it from 5 to 200 and it works fine.
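The situation described above, where region counts are even but data is not, can be illustrated with a small sketch. The skew measure below is a simplified stand-in (summed absolute deviation from the mean, scaled to the range 0 to 1) and is not the formula HBase uses; the per-server numbers are made up for illustration, and the class name is hypothetical.

```java
class StoreFileSkewSketch {
    /** Simplified skew in [0, 1]: 0 means perfectly even, 1 means everything on one server. */
    public static double skew(double[] perServer) {
        int n = perServer.length;
        double total = 0.0;
        for (double v : perServer) {
            total += v;
        }
        if (n < 2 || total == 0.0) {
            return 0.0;
        }
        double mean = total / n;
        double deviation = 0.0;
        for (double v : perServer) {
            deviation += Math.abs(v - mean);
        }
        // Largest possible deviation: the whole total sits on a single server.
        double maxDeviation = 2.0 * total * (n - 1) / n;
        return deviation / maxDeviation;
    }

    public static void main(String[] args) {
        // Hypothetical cluster: equal region counts, unequal store file sizes (GB).
        double[] regionCounts = {10, 10, 10};
        double[] storeFileGb = {100, 20, 30};

        double countSkew = skew(regionCounts);
        double sizeSkew = skew(storeFileGb);
        System.out.println("region count skew:    " + countSkew);
        System.out.println("store file size skew: " + sizeSkew);

        // Weighted contribution of the size skew at the old and the tested weight.
        System.out.println("size cost at weight 5:   " + 5 * sizeSkew);
        System.out.println("size cost at weight 200: " + 200 * sizeSkew);
    }
}
```

In this example region count skew reports nothing to fix, so only the store file size term can push the balancer to move data, and at a weight of 5 that push is easily outweighed by other cost functions.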

          People

            Assignee: Unassigned
            Reporter: Clara Xiong (claraxiong)
            Votes: 0
            Watchers: 7
