  1. Apache Trafodion (Retired)
  2. TRAFODION-2700

Query that selects only a single salt value gets parallel plan

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.1-incubating
    • Fix Version/s: None
    • Component/s: sql-cmp
    • Labels: None
    • Environment: any

    Description

      For some queries we saw parallel plans where the parallelism didn't really help, because the WHERE predicate selected only a single salt value. The overhead isn't huge, but it adds up when many such queries are executed.

      Example:

      create table ts(a integer not null primary key, b char(2000)) salt using 4 partitions;
      explain select count(*) from ts <<+ cardinality 1e7>> where a = 1;

      The problem, I think, is in the method SimpleFileScanOptimizer::scmComputeCostVectorsForHbase() in file core/sql/optimizer/ScmCostMethod.cpp. It computes separate degrees of parallelism (DoPs) for the region server side and the client side and scales the costs incurred on each side separately.

      However, if there are more ESPs (clients) than regions, some ESPs have nothing to do, limiting the parallelism. On the other hand, if there are more regions than ESPs, each ESP reads regions sequentially, so that limits the DoP on the region server side.

      Therefore, my suggested fix is to use the minimum of those two DoPs to compute the cost.
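
A rough sketch of the suggested fix, for illustration only. It is not taken from ScmCostMethod.cpp: the struct, the function names, and the simple "divide the cost by the DoP" scaling are all assumptions made here to show the idea of using the minimum of the two DoPs.

```cpp
#include <algorithm>

// Hypothetical inputs; these names do not come from the Trafodion source.
struct ScanParallelism {
  int numActiveRegions;  // regions (salt partitions) the predicate can select
  int numEsps;           // ESPs (clients) in the candidate plan
};

// Effective DoP, assuming the reasoning in this ticket: extra ESPs beyond the
// number of regions sit idle, and extra regions beyond the number of ESPs are
// read sequentially, so the smaller of the two values bounds the parallelism.
inline int effectiveDoP(const ScanParallelism& p) {
  int regions = std::max(1, p.numActiveRegions);  // guard against 0
  int esps = std::max(1, p.numEsps);              // serial plan counts as 1
  return std::min(regions, esps);
}

// Assumed, simplified scaling: spread the total cost evenly over the
// effective DoP instead of scaling each side by its own DoP.
inline double costPerStream(double totalCost, const ScanParallelism& p) {
  return totalCost / effectiveDoP(p);
}
```

For the example above, the predicate a = 1 selects a single salt value, so numActiveRegions would be 1. With 4 ESPs the effective DoP is then 1, and the parallel plan gets no cost advantage over a serial one in this model.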

          People

            hzeller Hans Zeller