IMPALA-4765

Catalog loading threads can be wasted waiting for a large table to load




      When the catalogd receives multiple requests to prioritize loading the same table, several catalog loading threads may end up waiting for that single table to load, effectively reducing the number of usable catalog loading threads. In extreme cases, this can degrade to serial loading of tables.

      Note that a single query may issue multiple table-loading requests for the same table if the table is very large. After issuing a load request, an impalad waits 2m for the metadata to arrive and then resends the request every 2m. So if a large table takes 20m to load, a single query could issue 10 table-loading requests, which ultimately tie up 10 table-loading threads in the catalogd.
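
      The arithmetic above can be written out as a small sketch. It assumes one request at time 0 and one more at each retry interval that elapses before the load finishes; the class and method names are hypothetical and not part of Impala.

```java
import java.time.Duration;

final class LoadRequestEstimate {
    // Number of load requests one query sends while a table is still loading,
    // assuming a request at time 0 and a resend at every retry interval that
    // elapses strictly before the load completes (an assumption of this sketch).
    static long requestsIssued(Duration loadTime, Duration retryInterval) {
        long loadMs = loadTime.toMillis();
        long retryMs = retryInterval.toMillis();
        if (retryMs <= 0) {
            throw new IllegalArgumentException("retryInterval must be positive");
        }
        if (loadMs <= 0) {
            return 1; // The initial request is always sent.
        }
        // Integer ceiling of loadMs / retryMs.
        return (loadMs + retryMs - 1) / retryMs;
    }

    public static void main(String[] args) {
        // The 20m load with a 2m retry interval described in this issue.
        System.out.println(requestsIssued(Duration.ofMinutes(20), Duration.ofMinutes(2)));
    }
}
```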

      The simplest way to diagnose the issue is to take a jstack of the catalogd and look for several stacks like this one:

         java.lang.Thread.State: WAITING (parking)
      	at sun.misc.Unsafe.park(Native Method)
      	- parking to wait for  <0x0000000502e8c998> (a java.util.concurrent.FutureTask) <--- see if several threads are waiting on the same FutureTask
      	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
      	at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:425)
      	at java.util.concurrent.FutureTask.get(FutureTask.java:187)
      	at org.apache.impala.catalog.TableLoadingMgr$LoadRequest.get(TableLoadingMgr.java:72)
      	at org.apache.impala.catalog.CatalogServiceCatalog.getOrLoadTable(CatalogServiceCatalog.java:738)
      	at org.apache.impala.catalog.TableLoadingMgr.loadNextTable(TableLoadingMgr.java:288)
      	at org.apache.impala.catalog.TableLoadingMgr.access$600(TableLoadingMgr.java:50)
      	at org.apache.impala.catalog.TableLoadingMgr$3.run(TableLoadingMgr.java:259)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:745)
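
      To check whether several threads are parked on the same FutureTask, the jstack output can be grouped by the address in the "parking to wait for" line. The following is a minimal sketch of such a helper, assuming the dump has been saved to a file; ParkedThreadGrouper and its method are hypothetical names, not an existing tool.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ParkedThreadGrouper {
    // Matches lines such as:
    //   - parking to wait for  <0x0000000502e8c998> (a java.util.concurrent.FutureTask)
    private static final Pattern PARKING =
        Pattern.compile("- parking to wait for\\s+<(0x[0-9a-fA-F]+)> \\(a ([^)]+)\\)");

    /** Counts, per object address, how many threads are parked on a FutureTask. */
    static Map<String, Integer> countFutureTaskWaiters(List<String> jstackLines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : jstackLines) {
            Matcher m = PARKING.matcher(line);
            if (m.find() && m.group(2).equals("java.util.concurrent.FutureTask")) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to a file containing saved jstack output.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        countFutureTaskWaiters(lines).forEach((addr, n) -> {
            if (n > 1) {
                System.out.println(n + " threads parked on FutureTask " + addr);
            }
        });
    }
}
```

      Any address reported with a count greater than one is a candidate for the problem described here, provided the corresponding stacks also pass through TableLoadingMgr.loadNextTable.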

      The buggy code can be found in TableLoadingMgr.java:

        private void loadNextTable() throws InterruptedException {
          // Always get the next table from the head of the deque.
          final TTableName tblName = tableLoadingDeque_.takeFirst();
          if (LOG.isTraceEnabled()) {
            LOG.trace("Loading next table. Remaining items in queue: "
                + tableLoadingDeque_.size());
          }
          try {
            // TODO: Instead of calling "getOrLoad" here we could call "loadAsync". We would
            // just need to add a mechanism for moving loaded tables into the Catalog.
            catalog_.getOrLoadTable(tblName.getDb_name(), tblName.getTable_name());
          } catch (CatalogException e) {
            // Ignore.
          }
        }

      Notice that the first few lines are intended to avoid loading the same table multiple times. However, the code does not prevent multiple threads from entering CatalogServiceCatalog.getOrLoadTable(), where they all block on the same future for the same table.
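
      One possible way to avoid tying up a worker per duplicate request is to hand every duplicate the future that is already in flight instead of scheduling another blocking task. The sketch below illustrates that idea in isolation; it is not Impala code and not necessarily how this issue should be fixed. DedupingLoader, requestLoad and workerTasksStarted are hypothetical names.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

final class DedupingLoader {
    // Loads currently in progress, keyed by table name.
    private final ConcurrentMap<String, Future<String>> inFlight = new ConcurrentHashMap<>();
    private final ExecutorService pool;
    // Number of tasks that actually occupied a worker thread (for illustration).
    final AtomicInteger workerTasksStarted = new AtomicInteger();

    DedupingLoader(int numThreads) {
        pool = Executors.newFixedThreadPool(numThreads);
    }

    /**
     * Returns the future for the given table. Only the first request for a table
     * submits work to the pool; duplicates receive the same future and do not
     * consume an additional worker thread.
     */
    Future<String> requestLoad(String tableName, Callable<String> loader) {
        return inFlight.computeIfAbsent(tableName, name ->
            pool.submit(() -> {
                workerTasksStarted.incrementAndGet();
                try {
                    return loader.call();
                } finally {
                    // Allow a later request to trigger a fresh load.
                    inFlight.remove(name);
                }
            }));
    }

    void shutdown() {
        pool.shutdown();
    }
}
```

      In this sketch the entry is dropped once the load finishes, so a later request starts a new load. A real catalog would instead publish the loaded table somewhere before removing the entry.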

      The issue is easy to reproduce by simulating a long table load and issuing several concurrent loads of the same table from an impalad. For example, you can first run "invalidate metadata t" and then run "desc t" several times concurrently.

      A slow table load can be simulated by adding a sleep inside the call() method of the FutureTask created in TableLoadingMgr.loadAsync().
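
      The shape of that change can be sketched outside Impala as a FutureTask whose Callable sleeps before doing the real work. This is only an illustration of the technique; SlowLoadSimulation and slowTask are hypothetical names, and the actual edit would go directly into the task built in TableLoadingMgr.loadAsync().

```java
import java.util.concurrent.Callable;
import java.util.concurrent.FutureTask;
import java.util.concurrent.TimeUnit;

final class SlowLoadSimulation {
    /**
     * Wraps a loader in a FutureTask whose call() sleeps before delegating,
     * which simulates a table that takes a long time to load.
     */
    static <T> FutureTask<T> slowTask(Callable<T> loader, long delayMillis) {
        return new FutureTask<T>(() -> {
            // Artificial delay, for reproducing the issue only.
            TimeUnit.MILLISECONDS.sleep(delayMillis);
            return loader.call();
        });
    }

    public static void main(String[] args) throws Exception {
        FutureTask<String> task = slowTask(() -> "table metadata", 2000);
        new Thread(task).start();
        // Every caller of get() blocks until the simulated load finishes.
        System.out.println(task.get());
    }
}
```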




            alex.behm Alexander Behm