Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-20989

JDBC - The GetOperationStatus + log can block query progress via sleep()

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 4.0.0
    • 4.0.0-alpha-1
    • HiveServer2
    • None

    Description

      There is an exponential sleep operation inside the CLIService which can end up adding tens of seconds to a query which has already completed.

      "HiveServer2-Handler-Pool: Thread-9373" #9373 prio=5 os_prio=0 tid=0x00007f4d5e72d800 nid=0xb634a waiting on condition [0x00007f28d06a5000]
      java.lang.Thread.State: TIMED_WAITING (sleeping)
      at java.lang.Thread.sleep(Native Method)
      at org.apache.hive.service.cli.CLIService.progressUpdateLog(CLIService.java:506)
      at org.apache.hive.service.cli.CLIService.getOperationStatus(CLIService.java:480)
      at org.apache.hive.service.cli.thrift.ThriftCLIService.GetOperationStatus(ThriftCLIService.java:695)
      at org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetOperationStatus.getResult(TCLIService.java:1757)
      at org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetOperationStatus.getResult(TCLIService.java:1742)
      at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
      at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
      at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
      at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at java.lang.Thread.run(Thread.java:745)
      

      The sleep loop is on the server side.

      private static final long PROGRESS_MAX_WAIT_NS = 30 * 1000000000l;
      
      private JobProgressUpdate progressUpdateLog(boolean isProgressLogRequested, Operation operation, HiveConf conf) {
      ...
          long startTime = System.nanoTime();
          int timeOutMs = 8;
          try {
            while (sessionState.getProgressMonitor() == null && !operation.isDone()) {
              long remainingMs = (PROGRESS_MAX_WAIT_NS - (System.nanoTime() - startTime)) / 1000000l;
              if (remainingMs <= 0) {
                LOG.debug("timed out and hence returning progress log as NULL");
                return new JobProgressUpdate(ProgressMonitor.NULL);
              }
              Thread.sleep(Math.min(remainingMs, timeOutMs));
              timeOutMs <<= 1;
            }
      

      After about 16 seconds of execution of the query, the timeOutMs is 16384 ms, which means the next sleep cycle is for min(30 - 17, 16) = 13.

      If the query finishes on the 17th second, the JDBC server will only respond after the 30th second when it will check for operation.isDone() and return.

      Attachments

        1. HIVE-20989.01.patch
          4 kB
          Sankar Hariappan

        Activity

          People

            sankarh Sankar Hariappan
            gopalv Gopal Vijayaraghavan
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: