Hive / HIVE-17481

LLAP workload management

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.0.0
    • Component/s: None
    • Labels: None

    Description

      This effort is intended to improve various aspects of cluster sharing for LLAP. Some of these are applicable to non-LLAP queries and may later be extended to all queries. Administrators will be able to specify and apply policies for workload management ("resource plans") that apply to the entire cluster, with only one resource plan being active at a time. The policies will be created and modified using new Hive DDL statements.
      The policies will cover:

      • Dividing the cluster into a set of (optionally nested) query pools, each allocated a fraction of the cluster, a fixed query parallelism, a policy for sharing resources between its queries, and potentially other settings such as priority.
      • Mapping the incoming queries into pools based on the query user, groups, explicit configuration, etc.
      • Specifying rules that perform actions on queries based on counter values (e.g. killing or moving queries).

      Administrators will also be able to switch policies on a live cluster, usually without affecting running queries, for example to apply different policies for daytime and nighttime usage patterns. The switches will be safe and atomic; versioning may eventually be supported.
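      To make the policy model concrete, here is a minimal Python sketch of a resource plan with nested pools, user and group mappings, counter-based triggers, and a validated switch of the active plan. It is an illustration only, not Hive's implementation: the class names, field names, the ELAPSED_TIME counter, and the rule that user mappings take precedence over group mappings are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Pool:
    """A query pool: a share of its parent plus a cap on concurrent queries."""
    name: str
    alloc_fraction: float       # fraction of the parent pool (or of the cluster)
    query_parallelism: int      # maximum number of concurrent queries
    children: List["Pool"] = field(default_factory=list)


@dataclass
class Trigger:
    """Hypothetical rule: act on a query once a counter exceeds a threshold."""
    counter: str
    threshold: int
    action: str                 # "KILL" or "MOVE" in this sketch
    target_pool: Optional[str] = None


@dataclass
class ResourcePlan:
    name: str
    pools: List[Pool]
    default_pool: str
    user_mappings: Dict[str, str] = field(default_factory=dict)
    group_mappings: Dict[str, str] = field(default_factory=dict)
    triggers: List[Trigger] = field(default_factory=list)

    def validate(self) -> None:
        """Reject plans whose sibling pools claim more than 100% of the parent."""
        def check(siblings: List[Pool], parent: str) -> None:
            total = sum(p.alloc_fraction for p in siblings)
            if total > 1.0 + 1e-9:
                raise ValueError(f"pools under {parent!r} claim {total:.2f} > 1.0")
            for p in siblings:
                check(p.children, p.name)
        check(self.pools, self.name)

    def pool_for(self, user: str, groups: List[str]) -> str:
        """Map a query to a pool: user first, then groups, then the default
        (this precedence is an assumption of the sketch)."""
        if user in self.user_mappings:
            return self.user_mappings[user]
        for group in groups:
            if group in self.group_mappings:
                return self.group_mappings[group]
        return self.default_pool

    def fired_triggers(self, counters: Dict[str, int]) -> List[Trigger]:
        """Return the triggers whose counter value exceeds the threshold."""
        return [t for t in self.triggers
                if counters.get(t.counter, 0) > t.threshold]


class WorkloadManager:
    """Holds at most one active plan; a switch is a single reference swap."""
    def __init__(self) -> None:
        self.active: Optional[ResourcePlan] = None

    def activate(self, plan: ResourcePlan) -> None:
        plan.validate()         # an invalid plan never replaces the active one
        self.active = plan
```

      In this sketch a daytime and a nighttime plan would be two ResourcePlan objects, and switching between them is one call to activate(); a plan that fails validation leaves the previously active plan in place.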

      Some implementation details:

      • WM will only be supported in HS2 (for obvious reasons).
      • All LLAP query AMs will run in the "interactive" YARN queue and will be fungible between Hive pools.
      • We will use the concept of "guaranteed tasks" (also known as ducks) to enforce cluster allocation without a central scheduler and without compromising throughput. Guaranteed tasks preempt other (speculative) tasks and are distributed from HS2 to AMs, and from AMs to tasks, in accordance with percentage allocations in the policy. Each "duck" corresponds to a CPU resource on the cluster. The implementation will be isolated so as to allow different ones later.
      • In the future, we may consider improved task placement and late binding, similar to those described in the Sparrow paper, to work around potential hotspots and similar problems that the decentralized scheme does not avoid.
      • Only one HS2 will initially be supported, to avoid split-brain workload management. We will also implement (in a tangential set of work items) active-passive HS2 recovery. Eventually, we intend to switch to a full active-active HS2 configuration with a shared WM and Tez session pool (unlike the current case with 2 separate session pools).
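      The two-level distribution of guaranteed tasks described above (HS2 to AMs according to pool percentages) can be sketched as follows. This is a rough illustration under stated assumptions, not the actual Hive code: the function names are hypothetical, pool shares are given as integer percentages, rounding uses the largest-remainder method, and the ducks of a pool are split evenly among its running queries (one AM per query).

```python
from typing import Dict, List


def split_by_percent(total: int, percents: Dict[str, int]) -> Dict[str, int]:
    """Split `total` whole ducks by integer percentages (largest remainder)."""
    if sum(percents.values()) > 100:
        raise ValueError("pool percentages exceed 100")
    # Integer arithmetic only, so there are no floating-point rounding surprises.
    shares = {name: total * pct // 100 for name, pct in percents.items()}
    remainders = {name: total * pct % 100 for name, pct in percents.items()}
    leftover = total * sum(percents.values()) // 100 - sum(shares.values())
    # Hand the leftover ducks to the pools with the largest remainders;
    # ties are broken by pool name to keep the result deterministic.
    ranked = sorted(percents, key=lambda name: (-remainders[name], name))
    for name in ranked[:leftover]:
        shares[name] += 1
    return shares


def split_evenly(total: int, names: List[str]) -> Dict[str, int]:
    """Split a pool's ducks evenly among its queries (assumed fair policy)."""
    if not names:
        return {}
    base, extra = divmod(total, len(names))
    return {name: base + (1 if i < extra else 0)
            for i, name in enumerate(names)}


def distribute_ducks(total_ducks: int,
                     pool_percents: Dict[str, int],
                     queries_by_pool: Dict[str, List[str]]) -> Dict[str, int]:
    """HS2-side view: ducks per query AM, given pool percentages."""
    per_query: Dict[str, int] = {}
    for pool, ducks in split_by_percent(total_ducks, pool_percents).items():
        per_query.update(split_evenly(ducks, queries_by_pool.get(pool, [])))
    return per_query
```

      For example, with 10 ducks, pools at 70% and 30%, and two queries in the first pool and one in the second, the sketch hands out 4, 3, and 3 ducks. Ducks belonging to a pool with no running queries are simply left unassigned here; how such spare capacity is actually reused is not covered by this sketch.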

      Attachments

        1. Workload management design doc.pdf
          214 kB
          Sergey Shelukhin

        Issue Links

          1. implement workload management pools | Sub-task | Closed | Sergey Shelukhin
          2. support LLAP workload management in HS2 (low level only) | Sub-task | Resolved | Sergey Shelukhin
          3. add a notion of a guaranteed task to LLAP | Sub-task | Closed | Sergey Shelukhin
          4. allow AM to use LLAP guaranteed tasks | Sub-task | Closed | Sergey Shelukhin
          5. refactor TezSessionPoolManager to separate its multiple functions | Sub-task | Closed | Sergey Shelukhin
          6. refactor LlapProtocolClientProxy to be usable with other protocols | Sub-task | Closed | Sergey Shelukhin
          7. refactor LLAP ZK registry to make the ZK-registry part reusable | Sub-task | Closed | Sergey Shelukhin
          8. implement Tez AM registry in Hive | Sub-task | Closed | Sergey Shelukhin
          9. Implement global execution triggers based on counters | Sub-task | Closed | Prasanth Jayachandran
          10. Create schema required for workload management. | Sub-task | Closed | Harish JP
          11. Refactor WorkloadManager for accessing operations handles in service layer | Sub-task | Resolved | Prasanth Jayachandran
          12. Enable JDBC + MiniLLAP tests in HIVE-17508 after HIVE-17566 | Sub-task | Resolved | Unassigned
          13. Add trigger type to WM_TRIGGER table | Sub-task | Resolved | Unassigned
          14. Implement commands to manage resource plan | Sub-task | Closed | Harish JP
          15. Add support for custom counters in trigger expression | Sub-task | Closed | Prasanth Jayachandran
          16. Implement per pool trigger validation and move sessions across pools | Sub-task | Closed | Prasanth Jayachandran
          17. implement applying the resource plan | Sub-task | Closed | Sergey Shelukhin
          18. Implement create, alter and drop workload management triggers | Sub-task | Closed | Harish JP
          19. Display the reason for query cancellation | Sub-task | Closed | Prasanth Jayachandran
          20. add notions of default pool and start adding unmanaged mapping | Sub-task | Closed | Sergey Shelukhin
          21. implement query mapping to WM and non-WM based on policies | Sub-task | Resolved | Sergey Shelukhin
          22. handle internal Tez AM restart in registry and WM | Sub-task | Closed | Sergey Shelukhin
          23. propagate background LLAP cluster changes to WM | Sub-task | Closed | Sergey Shelukhin
          24. use kill query mechanics to kill queries in WM | Sub-task | Closed | Sergey Shelukhin
          25. enable and apply resource plan commands in HS2 | Sub-task | Closed | Sergey Shelukhin
          26. Support triggers for non-pool sessions | Sub-task | Closed | Prasanth Jayachandran
          27. Implement resource plan fetching from metastore | Sub-task | Closed | Prasanth Jayachandran
          28. Implement pool, user, group and trigger to pool management API's. | Sub-task | Closed | Harish JP
          29. add group support for pool mappings | Sub-task | Closed | Sergey Shelukhin
          30. add explicit jdbc connection string args for mappings | Sub-task | Closed | Sergey Shelukhin
          31. investigate deriving app name from JDBC connection for pool mapping | Sub-task | Closed | Sergey Shelukhin
          32. print per query workload management trace after query execution | Sub-task | Resolved | Unassigned
          33. Push resource plan changes to tez/unmanaged sessions | Sub-task | Closed | Prasanth Jayachandran
          34. fix WM based on cluster smoke test; add logging | Sub-task | Closed | Sergey Shelukhin
          35. beeline - support proper usernames based on the URL arg | Sub-task | Closed | Sergey Shelukhin
          36. add HS2 jmx information about pools and current resource plan | Sub-task | Closed | Sergey Shelukhin
          37. fix various WM bugs based on cluster testing - part 2 | Sub-task | Closed | Sergey Shelukhin
          38. AM may assert when its guaranteed task count is reduced | Sub-task | Closed | Sergey Shelukhin
          39. verify commands on a cluster | Sub-task | Closed | Harish JP
          40. killquery doesn't actually work for non-trigger WM kills | Sub-task | Closed | Sergey Shelukhin
          41. WM getSession needs some retry logic | Sub-task | Closed | Sergey Shelukhin
          42. Add WM event traces at query level for debugging | Sub-task | Closed | Prasanth Jayachandran
          43. add a unmanaged flag to triggers (applies to container based sessions) | Sub-task | Closed | Sergey Shelukhin
          44. add a user-friendly show plan command | Sub-task | Closed | Harish JP
          45. some alter resource plan fixes | Sub-task | Closed | Sergey Shelukhin
          46. Idempotent state change for resource plan | Sub-task | Resolved | Prasanth Jayachandran
          47. refactor reopen and file management in TezTask | Sub-task | Closed | Sergey Shelukhin
          48. User mapping not initialized correctly on start | Sub-task | Closed | Prasanth Jayachandran
          49. User mapping not initialized correctly on start | Sub-task | Resolved | Prasanth Jayachandran
          50. Implement validate resource plan (part 1) | Sub-task | Closed | Harish JP
          51. change the way WM is enabled and allow dropping the last resource plan | Sub-task | Closed | Sergey Shelukhin
          52. add the unmanaged mapping command | Sub-task | Closed | Sergey Shelukhin
          53. create plan like plan, and replace plan commands for easy modification | Sub-task | Closed | Sergey Shelukhin
          54. validate resource plan - part 2 - validate action and trigger expressions | Sub-task | Resolved | Harish JP
          55. implement scheduling policy configuration instead of hardcoding fair scheduling | Sub-task | Closed | Sergey Shelukhin
          56. add LLAP-level counters for WM | Sub-task | Closed | Sergey Shelukhin
          57. add AM level metrics for WM | Sub-task | Closed | Sergey Shelukhin
          58. add HS2-level WM metrics | Sub-task | Closed | Sergey Shelukhin
          59. clean up plugin between DAGs | Sub-task | Closed | Sergey Shelukhin
          60. output mappings summary in plan description | Sub-task | Resolved | Sergey Shelukhin
          61. use plan parallelism for the default pool if both are present | Sub-task | Closed | Sergey Shelukhin
          62. WM RP: it's impossible to unset things | Sub-task | Closed | Sergey Shelukhin
          63. improve show plan output (triggers, mappings) | Sub-task | Closed | Sergey Shelukhin
          64. Workload manager initializes even when interactive queue is not set | Sub-task | Closed | Prasanth Jayachandran

            People

              Assignee: Sergey Shelukhin (sershe)
              Reporter: Sergey Shelukhin (sershe)
              Votes: 0
              Watchers: 10

              Dates

                Created:
                Updated:
                Resolved: