Uploaded image for project: 'TinkerPop'
  1. TinkerPop
  2. TINKERPOP-2341

GremlinClientExtensions.SubmitAsync hangs as it tries to dispose connection

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.4.3
    • Fix Version/s: None
    • Component/s: dotnet
    • Labels:
      None
    • Flags:
      Important

      Description

      We have been seeing occasional hangs when logging to Cosmos DB graph using the Gremlin.Net NuGet (version 3.4.3).

       

      Here's our code snippet: 

          var results = await GremlinClientExtensions.SubmitAsync<dynamic>(this.Client, gremlinExpression).ConfigureAwait(false);

       

          foreach (var result in results)

         

      {         Tracer.TraceMessage()(             traceId,             "Successfully logged activity result (ElapsedMilliseconds=\{0}

      ): '{1}'. Request charge is '{2}'.",

                  sw.ElapsedMilliseconds,

                  JsonConvert.SerializeObject(result),

                  this.GetRequestCharge(results.StatusAttributes));

          }

       

      From the dumps, it seems the connection to the DB may have hit some failure:

       0:000> !PrintException /d 000001c61a0fd2c8
      Exception object: 000001c61a0fd2c8
      Exception type: System.Net.WebSockets.WebSocketException
      Message: An internal WebSocket error occurred. Please see the innerException, if present, for more details.
      InnerException: System.IO.IOException, Use !PrintException 000001c61a0fa760 to see more.
      StackTrace (generated):
      SP IP Function
      000000D5B446CA20 00007FFD9263D383 System_ni!System.Net.WebSockets.WebSocketBase.ThrowIfConvertibleException(System.String, System.Exception, System.Threading.CancellationToken, Boolean)+0x483
      000000D5B446CAC0 00007FFD9203372B System_ni!System.Net.WebSockets.WebSocketBase+<ReceiveAsyncCore>d__45.MoveNext()+0x56b
      000000D5B446A360 00007FFD93F7C3DF mscorlib_ni!System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()+0x1f
      000000D5B446A390 00007FFD935EDF2C mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)+0x3c
      000000D5B446A3C0 00007FFD35D7B46C Gremlin_Net!Gremlin.Net.Driver.WebSocketConnection+<ReceiveMessageAsync>d__9.MoveNext()+0x20c
      000000D5B4467CA0 00007FFD93F7C3DF mscorlib_ni!System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()+0x1f
      000000D5B4467CD0 00007FFD935EDF2C mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)+0x3c
      000000D5B4467D00 00007FFD35D7A96F Gremlin_Net!Gremlin.Net.Driver.Connection+<ReceiveMessagesAsync>d__20.MoveNext()+0x16f
      000000D3EF8FE770 00007FFD93F7C3DF mscorlib_ni!System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()+0x1f
      000000D3EF8FE7A0 00007FFD935EDF2C mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)+0x3c
      000000D3EF8FE7D0 00007FFD35D8B1BC Gremlin_Net!Gremlin.Net.Driver.ProxyConnection+<SubmitAsync>d__3`1[[System.__Canon, mscorlib]].MoveNext()+0x15c
      000000D3EF8FC0D0 00007FFD93F7C3DF mscorlib_ni!System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()+0x1f
      000000D3EF8FC100 00007FFD935EDF2C mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)+0x3c
      000000D3EF8FC130 00007FFD35D8887C Gremlin_Net!Gremlin.Net.Driver.GremlinClient+<SubmitAsync>d__6`1[[System.__Canon, mscorlib]].MoveNext()+0x2ac

      StackTraceString: <none>
      HResult: 80004005
      0:000> !PrintException /d 000001c61a0fa760
      Exception object: 000001c61a0fa760
      Exception type: System.IO.IOException
      Message: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host.
      InnerException: System.Net.Sockets.SocketException, Use !PrintException 000001c61a0fa570 to see more.
      StackTrace (generated):
      SP IP Function
      000000D5B447A610 00007FFD928993C3 System_ni!System.Net.Security._SslStream.EndRead(System.IAsyncResult)+0x7f3873
      000000D5B447A660 00007FFD920A5B4C System_ni!System.Net.TlsStream.EndRead(System.IAsyncResult)+0x10c
      000000D5B447A6B0 00007FFD94043F13 mscorlib_ni!System.Threading.Tasks.TaskFactory`1+FromAsyncTrimPromise`1[[System.Int32, mscorlib],[System.__Canon, mscorlib]].Complete(System._Canon, System.Func`3<System._Canon,System.IAsyncResult,Int32>, System.IAsyncResult, Boolean)+0x23
      000000D5B4475DD0 00007FFD93F7C3DF mscorlib_ni!System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()+0x1f
      000000D5B4475E00 00007FFD935EDF2C mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)+0x3c
      000000D5B4475E30 00007FFD92035D25 System_ni!System.Net.WebSockets.WebSocketConnectionStream+<ReadAsync>d__21.MoveNext()+0x385
      000000D5B4471430 00007FFD93F7C3DF mscorlib_ni!System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()+0x1f
      000000D5B4471460 00007FFD935EDF2C mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)+0x3c
      000000D5B4471490 00007FFD92034541 System_ni!System.Net.WebSockets.WebSocketBase+WebSocketOperation+<Process>d__19.MoveNext()+0x8e1
      000000D5B446ED40 00007FFD93F7C3DF mscorlib_ni!System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()+0x1f
      000000D5B446ED70 00007FFD935EDF2C mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)+0x3c
      000000D5B446EDA0 00007FFD9203339E System_ni!System.Net.WebSockets.WebSocketBase+<ReceiveAsyncCore>d__45.MoveNext()+0x1de

      StackTraceString: <none>
      HResult: 80131620
      0:000> !PrintException /d 000001c61a0fa570
      Exception object: 000001c61a0fa570
      Exception type: System.Net.Sockets.SocketException
      Message: An existing connection was forcibly closed by the remote host
      InnerException: <none>
      StackTrace (generated):
      SP IP Function
      000000D5B447ED00 00007FFD928975F7 System_ni!System.Net.Sockets.Socket.EndReceive(System.IAsyncResult)+0x7fe207
      000000D5B447ED40 00007FFD9209924D System_ni!System.Net.Sockets.NetworkStream.EndRead(System.IAsyncResult)+0x2d

       

       

      But then the Gremlin.Net 's exception handling to dispose and close connections seemed to hit some deadlock:

      0:000> ~26e !clrstack

      OS Thread Id: 0x31b8 (26)

              Child SP               IP Call Site

      000000d3ef8f8798 00007ffda80b6594 [GCFrame: 000000d3ef8f8798]

      000000d3ef8f8868 00007ffda80b6594 [HelperMethodFrame_1OBJ: 000000d3ef8f8868] System.Threading.Monitor.ObjWait(Boolean, Int32, System.Object)

      000000d3ef8f8980 00007ffd935f6a0f System.Threading.ManualResetEventSlim.Wait(Int32, System.Threading.CancellationToken) [f:\dd\ndp\clr\src\BCL\system\threading\ManualResetEventSlim.cs @ 669]

      000000d3ef8f8a10 00007ffd935ef815 System.Threading.Tasks.Task.SpinThenBlockingWait(Int32, System.Threading.CancellationToken) [f:\dd\ndp\clr\src\BCL\system\threading\Tasks\Task.cs @ 3320]

      000000d3ef8f8a80 00007ffd93f60a88 System.Threading.Tasks.Task.InternalWait(Int32, System.Threading.CancellationToken) [f:\dd\ndp\clr\src\BCL\system\threading\Tasks\Task.cs @ 3259]

      000000d3ef8f8b50 00007ffd935eef08 System.Threading.Tasks.Task.Wait(Int32, System.Threading.CancellationToken) [f:\dd\ndp\clr\src\BCL\system\threading\Tasks\Task.cs @ 3167]

      000000d3ef8f8b80 00007ffd35d53f7f Gremlin.Net.Process.Utils.WaitUnwrap(System.Threading.Tasks.Task)

      000000d3ef8f8bc0 00007ffd35db01dc Gremlin.Net.Driver.ConnectionPool.ReturnConnectionIfOpen(Gremlin.Net.Driver.Connection)

      000000d3ef8f8bf0 00007ffd35d88937 Gremlin.Net.Driver.GremlinClient+d__6`1[[System.__Canon, mscorlib]].MoveNext()

      000000d3ef8fbfe8 00007ffd94756f21 [HelperMethodFrame: 000000d3ef8fbfe8]

      000000d3ef8fc0d0 00007ffd93f7c3df System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() [f:\dd\ndp\clr\src\BCL\system\runtime\exceptionservices\exceptionservicescommon.cs @ 133]

      000000d3ef8fc100 00007ffd935edf2c System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task) [f:\dd\ndp\clr\src\BCL\system\runtime\compilerservices\TaskAwaiter.cs @ 156]

      000000d3ef8fc130 00007ffd35d8887c Gremlin.Net.Driver.GremlinClient+d__6`1[[System.__Canon, mscorlib]].MoveNext()

      000000d3ef8fc1e0 00007ffd935a5ab3 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 954]

      000000d3ef8fc2b0 00007ffd935a5944 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 902]

      000000d3ef8fc2e0 00007ffd935f12bc System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner.Run() [f:\dd\ndp\clr\src\BCL\system\runtime\compilerservices\AsyncMethodBuilder.cs @ 1070]

      000000d3ef8fc330 00007ffd935f80b2 System.Threading.Tasks.AwaitTaskContinuation.RunOrScheduleAction(System.Action, Boolean, System.Threading.Tasks.Task ByRef) [f:\dd\ndp\clr\src\BCL\system\threading\Tasks\TaskContinuation.cs @ 810]

      000000d3ef8fc380 00007ffd93573dc7 System.Threading.Tasks.Task.FinishContinuations() [f:\dd\ndp\clr\src\BCL\system\threading\Tasks\Task.cs @ 3617]

      000000d3ef8fc410 00007ffd935ee890 System.Threading.Tasks.Task.Finish(Boolean) [f:\dd\ndp\clr\src\BCL\system\threading\Tasks\Task.cs @ 2263]

      000000d3ef8fc470 00007ffd935f0101 System.Threading.Tasks.Task`1[[System.__Canon, mscorlib]].TrySetException(System.Object) [f:\dd\ndp\clr\src\BCL\system\threading\Tasks\Future.cs @ 606]

      000000d3ef8fc4c0 00007ffd935f06a8 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.__Canon, mscorlib]].SetException(System.Exception)

      000000d3ef8fc500 00007ffd35d8b27d Gremlin.Net.Driver.ProxyConnection+d__3`1[[System.__Canon, mscorlib]].MoveNext()

      000000d3ef8fe688 00007ffd94756f21 [HelperMethodFrame: 000000d3ef8fe688]

      000000d3ef8fe770 00007ffd93f7c3df System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() [f:\dd\ndp\clr\src\BCL\system\runtime\exceptionservices\exceptionservicescommon.cs @ 133]

      000000d3ef8fe7a0 00007ffd935edf2c System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task) [f:\dd\ndp\clr\src\BCL\system\runtime\compilerservices\TaskAwaiter.cs @ 156]

      000000d3ef8fe7d0 00007ffd35d8b1bc Gremlin.Net.Driver.ProxyConnection+d__3`1[[System.__Canon, mscorlib]].MoveNext()

      000000d3ef8fe860 00007ffd935a5ab3 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 954]

      000000d3ef8fe930 00007ffd935a5944 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 902]

      000000d3ef8fe960 00007ffd935f12bc System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner.Run() [f:\dd\ndp\clr\src\BCL\system\runtime\compilerservices\AsyncMethodBuilder.cs @ 1070]

      000000d3ef8fe9b0 00007ffd935e4e22 System.Threading.ThreadPoolWorkQueue.Dispatch() [f:\dd\ndp\clr\src\BCL\system\threading\threadpool.cs @ 820]

      000000d3ef8fee48 00007ffd94646a93 [DebuggerU2MCatchHandlerFrame: 000000d3ef8fee48]

       

      Looks like Gremlin.Net handles a dead connection by attempting to forcing a closure of the connection pool.  This path calls Task.Wait() instead of using await Task, so it is very possible that this is the cause of the hang.

       

      Might be related to this issue:

      https://issues.apache.org/jira/projects/TINKERPOP/issues/TINKERPOP-2268?filter=allopenissues&orderby=affectedVersion+DESC%2C+priority+DESC%2C+updated+DESC

       

        Attachments

          Activity

            People

            • Assignee:
              Unassigned
              Reporter:
              wll Wilson Li
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated: