Livy / LIVY-388

Livy should expose server.connect.timeout in the REST API

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 0.3
    • Fix Version/s: None
    • Component/s: Core
    • Labels:
      None

      Description

      Consider a node where NN1 (a NameNode) and a Hive Metastore instance are down, but HA is configured for both services. Running a Livy script creates a new session, which then waits for ipc.client.connect.timeout (20s) on each jar upload into HDFS:

      17/07/31 13:59:29 INFO ContextLauncher: 17/07/31 13:59:39 INFO Client: Source and destination file systems are the same. Not copying hdfs://prabhu/hdp/apps/2.6.1.0-129/spark/spark-hdp-assembly.jar
      17/07/31 13:59:49 INFO ContextLauncher: 17/07/31 13:59:49 INFO Client: Uploading resource file:/usr/hdp/current/livy-server/rsc-jars/livy-rsc-0.3.0.2.6.1.0-129.jar -> hdfs://prabhu/user/diasmi/.sparkStaging/application_1501501991083_0001/livy-rsc-0.3.0.2.6.1.0-129.jar
      

      and then for a further 5 seconds (hive.metastore.client.socket.timeout) while connecting to the Hive Metastore:

      17/07/26 09:09:46 INFO ContextLauncher: 17/07/26 09:09:46 INFO metastore: Trying to connect to metastore with URI thrift://prabhu01:9083
      17/07/26 09:09:51 INFO ContextLauncher: 17/07/26 09:09:51 WARN metastore: Failed to connect to the MetaStore Server...
      17/07/26 09:09:51 INFO ContextLauncher: 17/07/26 09:09:51 INFO metastore: Trying to connect to metastore with URI thrift://prabhu02:9083
      17/07/26 09:09:51 INFO ContextLauncher: 17/07/26 09:09:51 INFO metastore: Connected to metastore.
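
The waits above add up quickly against the Livy server connect timeout. As a rough sketch only, using the three timeout values quoted in this report (20s, 5s, and the 90s default of server.connect.timeout) and a hypothetical number of jar uploads, the failover delay can be estimated like this. The function names are illustrative, and the estimate ignores normal Spark startup time:

```python
from datetime import timedelta

# Timeout values as quoted in this report.
IPC_CONNECT_TIMEOUT = timedelta(seconds=20)      # ipc.client.connect.timeout
METASTORE_SOCKET_TIMEOUT = timedelta(seconds=5)  # hive.metastore.client.socket.timeout
SERVER_CONNECT_TIMEOUT = timedelta(seconds=90)   # server.connect.timeout default


def failover_delay(jar_uploads: int, dead_metastores: int = 1) -> timedelta:
    """Delay added by failover waits alone (hypothetical helper).

    Assumes every jar upload first waits out the full IPC connect timeout
    against the dead NameNode, and every dead metastore URI costs one
    socket timeout before the client moves on to the next URI.
    """
    return (jar_uploads * IPC_CONNECT_TIMEOUT
            + dead_metastores * METASTORE_SOCKET_TIMEOUT)


def exceeds_connect_timeout(jar_uploads: int, dead_metastores: int = 1) -> bool:
    """True if the failover waits alone already exceed the 90s default."""
    return failover_delay(jar_uploads, dead_metastores) > SERVER_CONNECT_TIMEOUT
```

Under these assumptions, four uploads cost 85 seconds and still fit, while five uploads cost 105 seconds and exceed the default before the context has a chance to connect back.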
      

      Session creation then fails when the Livy server connect timeout expires. The default of 90 seconds, defined as RPC_CLIENT_HANDSHAKE_TIMEOUT("server.connect.timeout", "90s"), is too low for this case. The setting should be exposed via the REST API so that other components, such as Zeppelin, can override it.

      17/07/31 14:00:51 ERROR RSCClient: Failed to connect to context.
      java.util.concurrent.TimeoutException: Timed out waiting for context to start.
              at com.cloudera.livy.rsc.ContextLauncher.connectTimeout(ContextLauncher.java:133)
              at com.cloudera.livy.rsc.ContextLauncher.access$200(ContextLauncher.java:62)
              at com.cloudera.livy.rsc.ContextLauncher$2.run(ContextLauncher.java:121)
              at io.netty.util.concurrent.PromiseTask$RunnableAdapter.call(PromiseTask.java:38)
              at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:120)
              at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
              at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
              at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
              at java.lang.Thread.run(Thread.java:745)
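
To illustrate the request, here is a minimal sketch of what a per-session override might look like from a client such as Zeppelin. It only builds the JSON body for Livy's POST /sessions endpoint, which accepts a conf map. The key name livy.rsc.server.connect.timeout is an assumption (the "livy.rsc." prefix applied to the setting named above), and whether the server honors it when passed this way is exactly what this issue asks for, so treat it as hypothetical:

```python
import json

# Assumed key: the "livy.rsc." prefix applied to server.connect.timeout.
# Whether Livy honors this key from a REST request is not confirmed here.
CONNECT_TIMEOUT_KEY = "livy.rsc.server.connect.timeout"


def build_session_request(kind: str, connect_timeout: str) -> str:
    """Return a JSON body for POST /sessions carrying a timeout override.

    `kind` is the session kind (for example "spark"); `connect_timeout`
    uses the same duration format as the default quoted above ("90s").
    """
    body = {
        "kind": kind,
        "conf": {CONNECT_TIMEOUT_KEY: connect_timeout},
    }
    return json.dumps(body, sort_keys=True)
```

For example, build_session_request("spark", "300s") produces a body that asks for a five-minute connect timeout for that session only, leaving the server-wide default untouched.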
      

            People

            • Assignee:
              Unassigned
            • Reporter:
              Prabhu Joseph