Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: Impala 2.6.0
    • Fix Version/s: Impala 2.9.0
    • Component/s: Backend
    • Labels: None

      Description

      Since we will multi-thread query execution at the fragment level, we should rework KuduScanNode to only use a single thread (the one that's executing the fragment).

        Activity

        joemcdonnell Joe McDonnell added a comment -

        commit 5bb988b1c58a2377777312c8e4dd56cbd0dee8a2
        Author: Joe McDonnell <joemcdonnell@cloudera.com>
        Date: Fri Mar 3 17:47:23 2017 -0800

        IMPALA-4996: Single-threaded KuduScanNode

        This introduces KuduScanNodeMt, the single-threaded version
        of KuduScanNode that materializes the tuples in GetNext().
        KuduScanNodeMt is enabled by the same condition as
        HdfsScanNodeMt: mt_dop is greater than or equal to 1.

        To share code between the two implementations, KuduScanNode
        and KuduScanNodeMt are now subclasses of KuduScanNodeBase,
        which implements the shared code. The KuduScanner is
        minimally impacted, as it already had the required GetNext
        interface.

        Since the KuduClient is a heavy-weight object, it is now
        shared at the QueryState level. We try to share the
        KuduClient as much as possible, but there are times when
        the KuduClient cannot be shared. Each Kudu table has
        master addresses stored in the Hive Metastore. We only
        share KuduClients for tables that have an identical value
        for the master addresses. In the ideal case, every Kudu
        table will have the same value, but there is no explicit
        guarantee of this.

        The testing for this is a modified version of
        kudu-scan-node.test run with various mt_dop values.

        Change-Id: I6e4593300e376bc508b78acaea64ffdd2c73a67a
        Reviewed-on: http://gerrit.cloudera.org:8080/6312
        Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
        Tested-by: Impala Public Jenkins
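
The client-sharing rule described in the commit message above (one KuduClient per distinct master-address value, held at the query level) can be illustrated with a minimal sketch. This is not the code from the commit: `FakeKuduClient`, `QueryClientCache`, `GetClient`, and `NumClients` are hypothetical names, the stand-in client does not connect to anything, and keying the map on the exact address list (so that the same addresses in a different order would not be shared) is an assumption made for illustration.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for the real Kudu client; it only records the
// master addresses it was created for and performs no network I/O.
struct FakeKuduClient {
  std::vector<std::string> master_addresses;
};

// Hypothetical per-query cache. The commit message says the client is shared
// at the QueryState level; this class only models that sharing rule.
class QueryClientCache {
 public:
  // Returns the cached client if an identical address list was seen before,
  // otherwise creates a new client and caches it.
  std::shared_ptr<FakeKuduClient> GetClient(
      const std::vector<std::string>& master_addresses) {
    std::lock_guard<std::mutex> guard(lock_);
    auto it = clients_.find(master_addresses);
    if (it != clients_.end()) return it->second;
    auto client =
        std::make_shared<FakeKuduClient>(FakeKuduClient{master_addresses});
    clients_.emplace(master_addresses, client);
    return client;
  }

  // Number of distinct clients created so far for this query.
  std::size_t NumClients() {
    std::lock_guard<std::mutex> guard(lock_);
    return clients_.size();
  }

 private:
  std::mutex lock_;  // Several fragment threads may request clients at once.
  // Assumption: the key is the address list exactly as stored for the table.
  std::map<std::vector<std::string>, std::shared_ptr<FakeKuduClient>> clients_;
};
```

In this sketch, two tables whose stored master addresses are identical receive the same client object, while a table with a different value gets its own client. That matches the commit message's point that sharing is best-effort and depends on the tables having the same value.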


          People

          • Assignee: Joe McDonnell (joemcdonnell)
          • Reporter: Alexander Behm (alex.behm)