Uploaded image for project: 'Ignite'
  1. Ignite
  2. IGNITE-6089

SQL: Improve query parallelism architecture

    XMLWordPrintableJSON

Details

    • Task
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 2.1
    • None
    • sql

    Description

      Currently query parallelism implement with static split of all indexes (including PK) for cache. This approach has several major disadvantages:
      1) It improves scans, but slows down index and range lookups
      2) Tables with different DOP cannot be used in the same query

      We need to fix that. Proposed plan:
      1) No more index splits, ever - there is one and only one index always
      2) Use preliminary execution plan, statistics (IGNITE-6079), CPU cores count and CPU load to estimate whether query will benefit from parallelism.
      3) if yes - split node-s single map query into several independent pieces.

      Splitting can be achieved in one of the following ways:
      1) Partition-based: e.g. if node owns partitions A, B, C and D, then we can split it to two queries - one over (A, B), another over (C, D). This could be useful for pure scans (e.g. DWH)
      2) Histogram-based: e.g. if we have a query SELECT ... WHERE salary > 50, and we know salary distribution, we can split it into WHERE salary > 50 AND salary <= 200 and WHERE salary > 200

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              vozerov Vladimir Ozerov
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated: