Uploaded image for project: 'Guacamole'
  1. Guacamole
  2. GUACAMOLE-86

Connection closed when printing large content in SSH

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.9.9
    • Fix Version/s: 0.9.11-incubating
    • Component/s: Terminal
    • Labels:
      None

      Description

      Guacd is crashing when running commands with big output on stdout from web ui

      Aug 25 07:10:34 ip-10-0-9-145 guacd[27432]: SSH connection successful.
      Aug 25 07:10:56 ip-10-0-9-145 server: 07:10:56.646 [Thread-5] ERROR o.g.g.w.GuacamoleWebSocketTunnelEndpoint - Connection to guacd terminated abnormally: Connection to guacd timed out.
      Aug 25 07:10:56 ip-10-0-9-145 server: 07:10:56.646 [Thread-5] DEBUG o.g.g.w.GuacamoleWebSocketTunnelEndpoint - Internal error during connection to guacd.
      Aug 25 07:10:56 ip-10-0-9-145 server: org.glyptodon.guacamole.GuacamoleUpstreamTimeoutException: Connection to guacd timed out.
      Aug 25 07:10:56 ip-10-0-9-145 server: 07:10:56.653 [Thread-5] DEBUG o.g.g.net.InetGuacamoleSocket - Closing socket to guacd.
      Aug 25 07:11:13 ip-10-0-9-145 server: 07:11:13.025 [http-bio-127.0.0.1-8080-exec-5] DEBUG o.g.g.net.InetGuacamoleSocket - Connecting to guacd at localhost:4822.
      
      Wed Aug 24 19:05:37 2016] traps: guacd[29476] general protection ip:7f288455a24c sp:7f2874e53120 error:0 in libc-2.17.so[7f28844da000+1b7000]
      tree /
      rw-rr- 1 root root 20M Aug 25 06:20 demo.txt
      

        Activity

        Hide
        nischay.rathor Nischay Rathor added a comment -
        Show
        nischay.rathor Nischay Rathor added a comment - Related to - https://glyptodon.org/jira/browse/GUAC-1223
        Hide
        nischay.rathor Nischay Rathor added a comment -

        System Info.
        1. AWS EC2 Instance - 2 core 8 GB RAM
        2. Centos 7 3.10.0-327.28.3.el7.x86_64
        3. guacamole clinet and server version is 0.9.9.

        Show
        nischay.rathor Nischay Rathor added a comment - System Info. 1. AWS EC2 Instance - 2 core 8 GB RAM 2. Centos 7 3.10.0-327.28.3.el7.x86_64 3. guacamole clinet and server version is 0.9.9.
        Hide
        nischay.rathor Nischay Rathor added a comment - - edited

        More logs when running guacd in debug mode

        Aug 25 08:52:38 ip-10-0-9-145 server: 08:52:38.556 [pool-1-thread-1] DEBUG o.g.g.n.b.r.a.BasicTokenSessionMap - Checking for expired sessions...
        Aug 25 08:52:38 ip-10-0-9-145 server: 08:52:38.556 [pool-1-thread-1] DEBUG o.g.g.n.b.r.a.BasicTokenSessionMap - Session check completed in 0 ms.
        Aug 25 08:52:41 ip-10-0-9-145 server: 08:52:41.354 [http-bio-127.0.0.1-8080-exec-2] DEBUG o.g.g.net.InetGuacamoleSocket - Connecting to guacd at localhost:4444.
        Aug 25 08:52:41 ip-10-0-9-145 guacd[2170]: Protocol "ssh" selected
        Aug 25 08:52:41 ip-10-0-9-145 guacd[2170]: Connection ID is "$0978da36-0a87-4b07-af7c-1d715b700cdc"
        Aug 25 08:52:41 ip-10-0-9-145 guacd[2170]: Current locale does not use UTF-8. Some characters may not render correctly.
        Aug 25 08:52:41 ip-10-0-9-145 guacd[2170]: Starting client
        Aug 25 08:52:41 ip-10-0-9-145 server: 08:52:41.406 [http-bio-127.0.0.1-8080-exec-2] INFO  o.g.g.net.basic.TunnelRequestService - User "demo" connected to connection "ssh connection".
        Aug 25 08:52:53 ip-10-0-9-145 sshd[2183]: error: Could not load host key: /etc/ssh/ssh_host_dsa_key
        Aug 25 08:52:53 ip-10-0-9-145 systemd-logind: New session 36 of user abc.
        Aug 25 08:52:53 ip-10-0-9-145 systemd: Started Session 36 of user abc.
        Aug 25 08:52:53 ip-10-0-9-145 systemd: Starting Session 36 of user abc.
        Aug 25 08:52:53 ip-10-0-9-145 guacd[2170]: SSH connection successful.
        Aug 25 08:52:53 ip-10-0-9-145 dbus[738]: [system] Activating service name='org.freedesktop.problems' (using servicehelper)
        Aug 25 08:52:53 ip-10-0-9-145 dbus-daemon: dbus[738]: [system] Activating service name='org.freedesktop.problems' (using servicehelper)
        Aug 25 08:52:53 ip-10-0-9-145 dbus[738]: [system] Successfully activated service 'org.freedesktop.problems'
        Aug 25 08:52:53 ip-10-0-9-145 dbus-daemon: dbus[738]: [system] Successfully activated service 'org.freedesktop.problems'
        Aug 25 08:53:19 ip-10-0-9-145 consul: 2016/08/25 08:53:19 [INFO] memberlist: Marking nqa-apihub-dataflow1-1-1b as failed, suspect timeout reached
        Aug 25 08:53:19 ip-10-0-9-145 consul[846]: memberlist: Marking nqa-apihub-dataflow1-1-1b as failed, suspect timeout reached
        Aug 25 08:53:19 ip-10-0-9-145 consul[846]: serf: EventMemberFailed: nqa-apihub-dataflow1-1-1b 10.0.11.69
        Aug 25 08:53:19 ip-10-0-9-145 consul: 2016/08/25 08:53:19 [INFO] serf: EventMemberFailed: nqa-apihub-dataflow1-1-1b 10.0.11.69
        Aug 25 08:53:38 ip-10-0-9-145 server: 08:53:38.556 [pool-1-thread-1] DEBUG o.g.g.n.b.r.a.BasicTokenSessionMap - Checking for expired sessions...
        Aug 25 08:53:38 ip-10-0-9-145 server: 08:53:38.556 [pool-1-thread-1] DEBUG o.g.g.n.b.r.a.BasicTokenSessionMap - Session check completed in 0 ms.
        Aug 25 08:53:59 ip-10-0-9-145 server: 08:53:59.750 [Thread-13] ERROR o.g.g.w.GuacamoleWebSocketTunnelEndpoint - Connection to guacd terminated abnormally: Connection to guacd timed out.
        Aug 25 08:53:59 ip-10-0-9-145 server: 08:53:59.751 [Thread-13] DEBUG o.g.g.w.GuacamoleWebSocketTunnelEndpoint - Internal error during connection to guacd.
        Aug 25 08:53:59 ip-10-0-9-145 server: org.glyptodon.guacamole.GuacamoleUpstreamTimeoutException: Connection to guacd timed out.
        Aug 25 08:53:59 ip-10-0-9-145 server: at org.glyptodon.guacamole.io.ReaderGuacamoleReader.read(ReaderGuacamoleReader.java:185) ~[guacamole-common-0.9.9.jar:na]
        Aug 25 08:53:59 ip-10-0-9-145 server: at org.glyptodon.guacamole.websocket.GuacamoleWebSocketTunnelEndpoint$2.run(GuacamoleWebSocketTunnelEndpoint.java:158) ~[guacamole-common-0.9.9.jar:na]
        Aug 25 08:53:59 ip-10-0-9-145 server: Caused by: java.net.SocketTimeoutException: Read timed out
        Aug 25 08:53:59 ip-10-0-9-145 server: at java.net.SocketInputStream.socketRead0(Native Method) ~[na:1.8.0_101]
        Aug 25 08:53:59 ip-10-0-9-145 server: at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) ~[na:1.8.0_101]
        Aug 25 08:53:59 ip-10-0-9-145 server: at java.net.SocketInputStream.read(SocketInputStream.java:170) ~[na:1.8.0_101]
        Aug 25 08:53:59 ip-10-0-9-145 server: at java.net.SocketInputStream.read(SocketInputStream.java:141) ~[na:1.8.0_101]
        Aug 25 08:53:59 ip-10-0-9-145 server: at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284) ~[na:1.8.0_101]
        Aug 25 08:53:59 ip-10-0-9-145 server: at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326) ~[na:1.8.0_101]
        Aug 25 08:53:59 ip-10-0-9-145 server: at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178) ~[na:1.8.0_101]
        Aug 25 08:53:59 ip-10-0-9-145 server: at java.io.InputStreamReader.read(InputStreamReader.java:184) ~[na:1.8.0_101]
        Aug 25 08:53:59 ip-10-0-9-145 server: at org.glyptodon.guacamole.io.ReaderGuacamoleReader.read(ReaderGuacamoleReader.java:174) ~[guacamole-common-0.9.9.jar:na]
        Aug 25 08:53:59 ip-10-0-9-145 server: ... 1 common frames omitted
        Aug 25 08:53:59 ip-10-0-9-145 server: 08:53:59.752 [Thread-13] INFO  o.g.g.net.basic.TunnelRequestService - User "demo" disconnected from connection "ssh connection". Duration: 78346 milliseconds
        Aug 25 08:53:59 ip-10-0-9-145 server: 08:53:59.752 [Thread-13] DEBUG o.g.g.net.InetGuacamoleSocket - Closing socket to guacd.
        
        Show
        nischay.rathor Nischay Rathor added a comment - - edited More logs when running guacd in debug mode Aug 25 08:52:38 ip-10-0-9-145 server: 08:52:38.556 [pool-1-thread-1] DEBUG o.g.g.n.b.r.a.BasicTokenSessionMap - Checking for expired sessions... Aug 25 08:52:38 ip-10-0-9-145 server: 08:52:38.556 [pool-1-thread-1] DEBUG o.g.g.n.b.r.a.BasicTokenSessionMap - Session check completed in 0 ms. Aug 25 08:52:41 ip-10-0-9-145 server: 08:52:41.354 [http-bio-127.0.0.1-8080-exec-2] DEBUG o.g.g.net.InetGuacamoleSocket - Connecting to guacd at localhost:4444. Aug 25 08:52:41 ip-10-0-9-145 guacd[2170]: Protocol "ssh" selected Aug 25 08:52:41 ip-10-0-9-145 guacd[2170]: Connection ID is "$0978da36-0a87-4b07-af7c-1d715b700cdc" Aug 25 08:52:41 ip-10-0-9-145 guacd[2170]: Current locale does not use UTF-8. Some characters may not render correctly. Aug 25 08:52:41 ip-10-0-9-145 guacd[2170]: Starting client Aug 25 08:52:41 ip-10-0-9-145 server: 08:52:41.406 [http-bio-127.0.0.1-8080-exec-2] INFO o.g.g.net.basic.TunnelRequestService - User "demo" connected to connection "ssh connection" . Aug 25 08:52:53 ip-10-0-9-145 sshd[2183]: error: Could not load host key: /etc/ssh/ssh_host_dsa_key Aug 25 08:52:53 ip-10-0-9-145 systemd-logind: New session 36 of user abc. Aug 25 08:52:53 ip-10-0-9-145 systemd: Started Session 36 of user abc. Aug 25 08:52:53 ip-10-0-9-145 systemd: Starting Session 36 of user abc. Aug 25 08:52:53 ip-10-0-9-145 guacd[2170]: SSH connection successful. Aug 25 08:52:53 ip-10-0-9-145 dbus[738]: [system] Activating service name='org.freedesktop.problems' (using servicehelper) Aug 25 08:52:53 ip-10-0-9-145 dbus-daemon: dbus[738]: [system] Activating service name='org.freedesktop.problems' (using servicehelper) Aug 25 08:52:53 ip-10-0-9-145 dbus[738]: [system] Successfully activated service 'org.freedesktop.problems' Aug 25 08:52:53 ip-10-0-9-145 dbus-daemon: dbus[738]: [system] Successfully activated service 'org.freedesktop.problems' Aug 25 08:53:19 ip-10-0-9-145 consul: 2016/08/25 08:53:19 [INFO] memberlist: Marking nqa-apihub-dataflow1-1-1b as failed, suspect timeout reached Aug 25 08:53:19 ip-10-0-9-145 consul[846]: memberlist: Marking nqa-apihub-dataflow1-1-1b as failed, suspect timeout reached Aug 25 08:53:19 ip-10-0-9-145 consul[846]: serf: EventMemberFailed: nqa-apihub-dataflow1-1-1b 10.0.11.69 Aug 25 08:53:19 ip-10-0-9-145 consul: 2016/08/25 08:53:19 [INFO] serf: EventMemberFailed: nqa-apihub-dataflow1-1-1b 10.0.11.69 Aug 25 08:53:38 ip-10-0-9-145 server: 08:53:38.556 [pool-1-thread-1] DEBUG o.g.g.n.b.r.a.BasicTokenSessionMap - Checking for expired sessions... Aug 25 08:53:38 ip-10-0-9-145 server: 08:53:38.556 [pool-1-thread-1] DEBUG o.g.g.n.b.r.a.BasicTokenSessionMap - Session check completed in 0 ms. Aug 25 08:53:59 ip-10-0-9-145 server: 08:53:59.750 [ Thread -13] ERROR o.g.g.w.GuacamoleWebSocketTunnelEndpoint - Connection to guacd terminated abnormally: Connection to guacd timed out. Aug 25 08:53:59 ip-10-0-9-145 server: 08:53:59.751 [ Thread -13] DEBUG o.g.g.w.GuacamoleWebSocketTunnelEndpoint - Internal error during connection to guacd. Aug 25 08:53:59 ip-10-0-9-145 server: org.glyptodon.guacamole.GuacamoleUpstreamTimeoutException: Connection to guacd timed out. Aug 25 08:53:59 ip-10-0-9-145 server: at org.glyptodon.guacamole.io.ReaderGuacamoleReader.read(ReaderGuacamoleReader.java:185) ~[guacamole-common-0.9.9.jar:na] Aug 25 08:53:59 ip-10-0-9-145 server: at org.glyptodon.guacamole.websocket.GuacamoleWebSocketTunnelEndpoint$2.run(GuacamoleWebSocketTunnelEndpoint.java:158) ~[guacamole-common-0.9.9.jar:na] Aug 25 08:53:59 ip-10-0-9-145 server: Caused by: java.net.SocketTimeoutException: Read timed out Aug 25 08:53:59 ip-10-0-9-145 server: at java.net.SocketInputStream.socketRead0(Native Method) ~[na:1.8.0_101] Aug 25 08:53:59 ip-10-0-9-145 server: at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) ~[na:1.8.0_101] Aug 25 08:53:59 ip-10-0-9-145 server: at java.net.SocketInputStream.read(SocketInputStream.java:170) ~[na:1.8.0_101] Aug 25 08:53:59 ip-10-0-9-145 server: at java.net.SocketInputStream.read(SocketInputStream.java:141) ~[na:1.8.0_101] Aug 25 08:53:59 ip-10-0-9-145 server: at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284) ~[na:1.8.0_101] Aug 25 08:53:59 ip-10-0-9-145 server: at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326) ~[na:1.8.0_101] Aug 25 08:53:59 ip-10-0-9-145 server: at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178) ~[na:1.8.0_101] Aug 25 08:53:59 ip-10-0-9-145 server: at java.io.InputStreamReader.read(InputStreamReader.java:184) ~[na:1.8.0_101] Aug 25 08:53:59 ip-10-0-9-145 server: at org.glyptodon.guacamole.io.ReaderGuacamoleReader.read(ReaderGuacamoleReader.java:174) ~[guacamole-common-0.9.9.jar:na] Aug 25 08:53:59 ip-10-0-9-145 server: ... 1 common frames omitted Aug 25 08:53:59 ip-10-0-9-145 server: 08:53:59.752 [ Thread -13] INFO o.g.g.net.basic.TunnelRequestService - User "demo" disconnected from connection "ssh connection" . Duration: 78346 milliseconds Aug 25 08:53:59 ip-10-0-9-145 server: 08:53:59.752 [ Thread -13] DEBUG o.g.g.net.InetGuacamoleSocket - Closing socket to guacd.
        Hide
        mike.jumper Michael Jumper added a comment -

        Nischay Rathor, can you please retry this against the current git and see if this is still an issue?

        If so, please try attaching to the running guacd+SSH process with gdb to obtain a backtrace of the segfault, once it occurs.

        Thanks!

        Show
        mike.jumper Michael Jumper added a comment - Nischay Rathor , can you please retry this against the current git and see if this is still an issue? If so, please try attaching to the running guacd+SSH process with gdb to obtain a backtrace of the segfault, once it occurs. Thanks!
        Hide
        nischay.rathor Nischay Rathor added a comment - - edited

        Michael Jumper I compiled from https://github.com/apache/incubator-guacamole-server master branch. Still seeing same issue.

        Attaching Stack stack strace.

        https://dl.dropboxusercontent.com/u/12181164/junk/sshd-strace.txt
        https://dl.dropboxusercontent.com/u/12181164/junk/guacd-strace.txt

        [Fri Aug 26 07:32:16 2016] guacd[4388]: segfault at 10 ip 00007f378f3faba1 sp 00007f37887d3bb8 error 4 in libpthread-2.17.so[7f378f3f1000+16000]
        [Fri Aug 26 07:34:23 2016] guacd[4551]: segfault at 10 ip 00007f378f3faba1 sp 00007f37887d3bb8 error 4 in libpthread-2.17.so[7f378f3f1000+16000]
        [Fri Aug 26 08:12:43 2016] guacd[6617]: segfault at 10 ip 00007f378f3faba1 sp 00007f37887d3bb8 error 4 in libpthread-2.17.so[7f378f3f1000+16000]
        
        Show
        nischay.rathor Nischay Rathor added a comment - - edited Michael Jumper I compiled from https://github.com/apache/incubator-guacamole-server master branch. Still seeing same issue. Attaching Stack stack strace. https://dl.dropboxusercontent.com/u/12181164/junk/sshd-strace.txt https://dl.dropboxusercontent.com/u/12181164/junk/guacd-strace.txt [Fri Aug 26 07:32:16 2016] guacd[4388]: segfault at 10 ip 00007f378f3faba1 sp 00007f37887d3bb8 error 4 in libpthread-2.17.so[7f378f3f1000+16000] [Fri Aug 26 07:34:23 2016] guacd[4551]: segfault at 10 ip 00007f378f3faba1 sp 00007f37887d3bb8 error 4 in libpthread-2.17.so[7f378f3f1000+16000] [Fri Aug 26 08:12:43 2016] guacd[6617]: segfault at 10 ip 00007f378f3faba1 sp 00007f37887d3bb8 error 4 in libpthread-2.17.so[7f378f3f1000+16000]
        Hide
        nischay.rathor Nischay Rathor added a comment - - edited

        When I am trying to attach pid of gucad to gdb, guacamole UI hangs and connection terminates. Same result when trying to attach sshd pid to gdb. Anything I am missing?

        0x00007f378f3ff67d in recvmsg () at ../sysdeps/unix/syscall-template.S:81
        81     	T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
        Missing separate debuginfos, use: debuginfo-install graphite2-1.3.6-1.el7_2.x86_64 harfbuzz-0.9.36-1.el7.x86_64 libffi-3.0.13-16.el7.x86_64 libssh2-1.4.3-10.el7_2.1.x86_64 pango-1.36.8-2.el7.x86_64
        
        Show
        nischay.rathor Nischay Rathor added a comment - - edited When I am trying to attach pid of gucad to gdb, guacamole UI hangs and connection terminates. Same result when trying to attach sshd pid to gdb. Anything I am missing? 0x00007f378f3ff67d in recvmsg () at ../sysdeps/unix/syscall-template.S:81 81 T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS) Missing separate debuginfos, use: debuginfo-install graphite2-1.3.6-1.el7_2.x86_64 harfbuzz-0.9.36-1.el7.x86_64 libffi-3.0.13-16.el7.x86_64 libssh2-1.4.3-10.el7_2.1.x86_64 pango-1.36.8-2.el7.x86_64
        Hide
        mike.jumper Michael Jumper added a comment -

        When I am trying to attach pid of gucad to gdb, guacamole UI hangs and connection terminates.

        gdb will suspend the process once it attaches. You will need to use the "continue" command to resume execution.

        Same result when trying to attach sshd pid to gdb.

        Sorry - I didn't mean that you need to debug sshd itself. When guacd establishes a connection, it forks a guard process which handles the actual connection to the remote desktop server (in this case sshd). The PID of that process will be listed in the log messages from guacd when the connection is established.

        That is the single process you will want to attach to when reproducing the issue for the sake of a stacktrace.

        Show
        mike.jumper Michael Jumper added a comment - When I am trying to attach pid of gucad to gdb, guacamole UI hangs and connection terminates. gdb will suspend the process once it attaches. You will need to use the "continue" command to resume execution. Same result when trying to attach sshd pid to gdb. Sorry - I didn't mean that you need to debug sshd itself. When guacd establishes a connection, it forks a guard process which handles the actual connection to the remote desktop server (in this case sshd). The PID of that process will be listed in the log messages from guacd when the connection is established. That is the single process you will want to attach to when reproducing the issue for the sake of a stacktrace.
        Hide
        nischay.rathor Nischay Rathor added a comment -

        HI Michael Jumper, Thanks for the help. Below is the gdb output from guacd, I got nothing from sshd.

        ***** First Attempt ***********
        (gdb) continue
        Continuing.
        
        
        [Thread 0x7f3784fcd700 (LWP 32252) exited]
        [Thread 0x7f376e7fc700 (LWP 32250) exited]
        [Thread 0x7f3785fcf700 (LWP 32276) exited]
        
        Program received signal SIGPIPE, Broken pipe.
        [Switching to Thread 0x7f376effd700 (LWP 32251)]
        0x00007f378f3ff1cd in write () at ../sysdeps/unix/syscall-template.S:81
        81      T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
        (gdb)
        Continuing.
        [Thread 0x7f37867d0700 (LWP 32255) exited]
        [Thread 0x7f376effd700 (LWP 32251) exited]
        [Inferior 1 (process 32247) exited normally]
        (gdb)
        The program is not being run.
        
        ****** Second Attempt ***********
        (gdb) continue
        Continuing.
        [Thread 0x7f812bfff700 (LWP 698) exited]
        [Thread 0x7f813196c700 (LWP 695) exited]
        [Thread 0x7f813296e700 (LWP 693) exited]
        
        Program received signal SIGPIPE, Broken pipe.
        [Switching to Thread 0x7f813216d700 (LWP 694)]
        0x00007f813fc1e1cd in write () at ../sysdeps/unix/syscall-template.S:81
        81      T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
        
        Show
        nischay.rathor Nischay Rathor added a comment - HI Michael Jumper , Thanks for the help. Below is the gdb output from guacd, I got nothing from sshd. ***** First Attempt *********** (gdb) continue Continuing. [Thread 0x7f3784fcd700 (LWP 32252) exited] [Thread 0x7f376e7fc700 (LWP 32250) exited] [Thread 0x7f3785fcf700 (LWP 32276) exited] Program received signal SIGPIPE, Broken pipe. [Switching to Thread 0x7f376effd700 (LWP 32251)] 0x00007f378f3ff1cd in write () at ../sysdeps/unix/syscall-template.S:81 81 T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS) (gdb) Continuing. [Thread 0x7f37867d0700 (LWP 32255) exited] [Thread 0x7f376effd700 (LWP 32251) exited] [Inferior 1 (process 32247) exited normally] (gdb) The program is not being run. ****** Second Attempt *********** (gdb) continue Continuing. [Thread 0x7f812bfff700 (LWP 698) exited] [Thread 0x7f813196c700 (LWP 695) exited] [Thread 0x7f813296e700 (LWP 693) exited] Program received signal SIGPIPE, Broken pipe. [Switching to Thread 0x7f813216d700 (LWP 694)] 0x00007f813fc1e1cd in write () at ../sysdeps/unix/syscall-template.S:81 81 T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
        Hide
        nischay.rathor Nischay Rathor added a comment -

        Also attaching gcore dump of both guacd and ssh process

        Show
        nischay.rathor Nischay Rathor added a comment - Also attaching gcore dump of both guacd and ssh process https://dl.dropboxusercontent.com/u/12181164/gcore-dump.zip
        Hide
        nischay.rathor Nischay Rathor added a comment - - edited

        I also tried running guacd in docker images. Even seeing same result in docker container instance.

        https://guacamole.incubator.apache.org/doc/gug/guacamole-docker.html

        I ran tree / as root from Guacamole UI running via tomcat behind Nginx

        [root@nqa-guacamole-1-1c 10.0.9.145:/etc/guacamole] docker logs fe1e441f824a
        guacd[1]: INFO:	Guacamole proxy daemon (guacd) version 0.9.9 started
        guacd[1]: INFO:	Listening on host 0.0.0.0, port 4822
        guacd[7]: INFO:	Guacamole connection closed during handshake
        guacd[8]: ERROR:       	Guacamole protocol violation. Perhaps the version of guacamole-client is incompatible with this version of guacd?
        guacd[9]: INFO:	Protocol "ssh" selected
        guacd[9]: INFO:	Connection ID is "$f6a67be6-0bb7-4e29-b178-ae63300c75fb"
        guacd[9]: INFO:	Starting client
        guacd[9]: INFO:	SSH connection successful.
        guacd[14]: INFO:       	Protocol "ssh" selected
        guacd[14]: INFO:       	Connection ID is "$e4e154d4-7e40-427f-9597-fd15c0f41727"
        guacd[14]: INFO:       	Starting client
        guacd[14]: INFO:       	Client disconnected
        guacd[18]: INFO:       	Protocol "ssh" selected
        guacd[18]: INFO:       	Connection ID is "$993cfd29-05a8-43ba-8a67-b5354e455987"
        guacd[18]: INFO:       	Starting client
        guacd[18]: INFO:       	SSH connection successful.
        guacd[18]: ERROR:      	Error reading data
        guacd[18]: INFO:       	Client disconnected
        guacd[18]: INFO:       	SSH connection ended.
        guacd[23]: INFO:       	Protocol "ssh" selected
        guacd[23]: INFO:       	Connection ID is "$091e4cbf-3210-4bf0-ad9d-b2ebd3fec7bc"
        guacd[23]: INFO:       	Starting client
        guacd[23]: INFO:       	SSH connection successful.
        guacd[23]: ERROR:      	Error reading data
        guacd[23]: INFO:       	Client disconnected
        guacd[23]: INFO:       	SSH connection ended.
        
        
        Show
        nischay.rathor Nischay Rathor added a comment - - edited I also tried running guacd in docker images. Even seeing same result in docker container instance. https://guacamole.incubator.apache.org/doc/gug/guacamole-docker.html I ran tree / as root from Guacamole UI running via tomcat behind Nginx [root@nqa-guacamole-1-1c 10.0.9.145:/etc/guacamole] docker logs fe1e441f824a guacd[1]: INFO: Guacamole proxy daemon (guacd) version 0.9.9 started guacd[1]: INFO: Listening on host 0.0.0.0, port 4822 guacd[7]: INFO: Guacamole connection closed during handshake guacd[8]: ERROR: Guacamole protocol violation. Perhaps the version of guacamole-client is incompatible with this version of guacd? guacd[9]: INFO: Protocol "ssh" selected guacd[9]: INFO: Connection ID is "$f6a67be6-0bb7-4e29-b178-ae63300c75fb" guacd[9]: INFO: Starting client guacd[9]: INFO: SSH connection successful. guacd[14]: INFO: Protocol "ssh" selected guacd[14]: INFO: Connection ID is "$e4e154d4-7e40-427f-9597-fd15c0f41727" guacd[14]: INFO: Starting client guacd[14]: INFO: Client disconnected guacd[18]: INFO: Protocol "ssh" selected guacd[18]: INFO: Connection ID is "$993cfd29-05a8-43ba-8a67-b5354e455987" guacd[18]: INFO: Starting client guacd[18]: INFO: SSH connection successful. guacd[18]: ERROR: Error reading data guacd[18]: INFO: Client disconnected guacd[18]: INFO: SSH connection ended. guacd[23]: INFO: Protocol "ssh" selected guacd[23]: INFO: Connection ID is "$091e4cbf-3210-4bf0-ad9d-b2ebd3fec7bc" guacd[23]: INFO: Starting client guacd[23]: INFO: SSH connection successful. guacd[23]: ERROR: Error reading data guacd[23]: INFO: Client disconnected guacd[23]: INFO: SSH connection ended.
        Hide
        mike.jumper Michael Jumper added a comment -

        I also tried running guacd in docker images. Even seeing same result in docker container instance.

        ...

        guacd[18]: ERROR:      	Error reading data
        guacd[18]: INFO:       	Client disconnected
        guacd[18]: INFO:       	SSH connection ended.
        

        This looks different, actually. There's no segfault. It looks like the actual underlying TCP connection to the SSH server somehow dies, and the Guacamole side of things logs the failure and cleans things up in an orderly fashion.

        Also attaching gcore dump of both guacd and ssh process

        Unfortunately, that core dump will only be readable in the context of the specific binary built on your system. It's unlikely that any other build will match things sufficiently.

        Below is the gdb output from guacd ...

        The next time this happens, and you get the prompt from gdb that execution has stopped, please use the "bt" command to produce the stacktrace. Lacking that, gdb only prints the line where the signal was received, which in this case is deep within a system call. The context relevant to debugging Guacamole is further up the chain.

        ... I got nothing from sshd.

        There is no need to run gdb against sshd. Only the guard process which is created by guacd for the SSH connection needs to be debugged. For example, in one of your earlier comments, the segfault in the system logs was:

        [Fri Aug 26 07:32:16 2016] guacd[4388]: segfault at 10 ip 00007f378f3faba1 sp 00007f37887d3bb8 error 4 in libpthread-2.17.so[7f378f3f1000+16000]
        

        The PID of the guacd process that died here was 4388. If you look earlier in the logs, you should find something like:

        guacd[4388]: INFO:	Protocol "ssh" selected
        guacd[4388]: INFO:	Connection ID is "... randomly-generated identifier ..."
        guacd[4388]: INFO:	Starting client
        guacd[4388]: INFO:	SSH connection successful
        

        That would have been the PID to attach gdb to for the sake of debugging. When trying to reproduce this, you should:

        1. Connect to SSH via Guacamole as normal.
        2. Once connected, look at the logs to find the messages above.
        3. Determine the PID from those messages, and attach gdb.
        4. Immediately run gdb's "continue" command to prevent the connection from timing out.
        5. Reproduce the issue.
        6. Once gdb pauses execution due to the segfault, run gdb's "bt" command to produce a stacktrace.

        As there will be multiple threads within the same process handling that SSH connection, it is always possible that gdb will not be within the relevant thread when the segfault signal is received, and that the stacktrace will be bogus ... In that case, the gdb command to use would be "thread apply all bt", which will produce stacktraces for all threads.

        ... That should hopefully be enough information for us to get an idea of where things are going wrong.

        Show
        mike.jumper Michael Jumper added a comment - I also tried running guacd in docker images. Even seeing same result in docker container instance. ... guacd[18]: ERROR: Error reading data guacd[18]: INFO: Client disconnected guacd[18]: INFO: SSH connection ended. This looks different, actually. There's no segfault. It looks like the actual underlying TCP connection to the SSH server somehow dies, and the Guacamole side of things logs the failure and cleans things up in an orderly fashion. Also attaching gcore dump of both guacd and ssh process Unfortunately, that core dump will only be readable in the context of the specific binary built on your system. It's unlikely that any other build will match things sufficiently. Below is the gdb output from guacd ... The next time this happens, and you get the prompt from gdb that execution has stopped, please use the "bt" command to produce the stacktrace. Lacking that, gdb only prints the line where the signal was received, which in this case is deep within a system call. The context relevant to debugging Guacamole is further up the chain. ... I got nothing from sshd. There is no need to run gdb against sshd. Only the guard process which is created by guacd for the SSH connection needs to be debugged. For example, in one of your earlier comments, the segfault in the system logs was: [Fri Aug 26 07:32:16 2016] guacd[4388]: segfault at 10 ip 00007f378f3faba1 sp 00007f37887d3bb8 error 4 in libpthread-2.17.so[7f378f3f1000+16000] The PID of the guacd process that died here was 4388. If you look earlier in the logs, you should find something like: guacd[4388]: INFO: Protocol "ssh" selected guacd[4388]: INFO: Connection ID is "... randomly-generated identifier ..." guacd[4388]: INFO: Starting client guacd[4388]: INFO: SSH connection successful That would have been the PID to attach gdb to for the sake of debugging. When trying to reproduce this, you should: Connect to SSH via Guacamole as normal. Once connected, look at the logs to find the messages above. Determine the PID from those messages, and attach gdb. Immediately run gdb's "continue" command to prevent the connection from timing out. Reproduce the issue. Once gdb pauses execution due to the segfault, run gdb's "bt" command to produce a stacktrace. As there will be multiple threads within the same process handling that SSH connection, it is always possible that gdb will not be within the relevant thread when the segfault signal is received, and that the stacktrace will be bogus ... In that case, the gdb command to use would be "thread apply all bt", which will produce stacktraces for all threads. ... That should hopefully be enough information for us to get an idea of where things are going wrong.
        Hide
        nischay.rathor Nischay Rathor added a comment -

        Hi Michael Jumper, I tried to collect GDB dump for guacd process as you mentioned. I tried it many times. Refer shared link. Below are my observation.

        1. Every time I executed tree / from ui (it dumps approx. 15mb of output to stdout). Guacd crashes in first or second attempt.
        2. I observe only 2 traps messages in dmesg.

        [Tue Aug 30 05:14:17 2016] traps: guacd[26841] general protection ip:7fe7bc0e61df sp:7fe7b48c6c78 error:0 in libguac-client-ssh.so.0.0.0[7fe7bc0d3000+1c000]
        [Tue Aug 30 05:40:36 2016] traps: guacd[28984] general protection ip:7f506858524c sp:7f5061102120 error:0 in libc-2.17.so[7f5068505000+1b7000]
        

        https://dl.dropboxusercontent.com/u/12181164/guacd-gdb-dump.txt

        Show
        nischay.rathor Nischay Rathor added a comment - Hi Michael Jumper , I tried to collect GDB dump for guacd process as you mentioned. I tried it many times. Refer shared link. Below are my observation. 1. Every time I executed tree / from ui (it dumps approx. 15mb of output to stdout). Guacd crashes in first or second attempt. 2. I observe only 2 traps messages in dmesg. [Tue Aug 30 05:14:17 2016] traps: guacd[26841] general protection ip:7fe7bc0e61df sp:7fe7b48c6c78 error:0 in libguac-client-ssh.so.0.0.0[7fe7bc0d3000+1c000] [Tue Aug 30 05:40:36 2016] traps: guacd[28984] general protection ip:7f506858524c sp:7f5061102120 error:0 in libc-2.17.so[7f5068505000+1b7000] https://dl.dropboxusercontent.com/u/12181164/guacd-gdb-dump.txt
        Hide
        nischay.rathor Nischay Rathor added a comment -

        Hi Michael Jumper Any clue what is wrong.

        -Nischay

        Show
        nischay.rathor Nischay Rathor added a comment - Hi Michael Jumper Any clue what is wrong. -Nischay
        Hide
        mike.jumper Michael Jumper added a comment -

        I think I know what's going on here. Originally added as part of GUAC-803, Guacamole's terminal emulator includes a function called guac_terminal_notify() which is used to inform the render loop that changes have been made to the terminal, even though data may not have been written to the terminal file descriptor.

        Unlike the other public guac_terminal_*() functions, the guac_terminal_notify() function is not threadsafe, and performing actions which invoke that function while another thread may be writing to the terminal via guac_terminal_printf() or guac_terminal_write_stdout() results in undefined behavior, and could easily cause the segfault you're seeing. In the case of both SSH and telnet, output to the terminal and scrolling/resizing the terminal are handled by separate threads.

        Show
        mike.jumper Michael Jumper added a comment - I think I know what's going on here. Originally added as part of GUAC-803 , Guacamole's terminal emulator includes a function called guac_terminal_notify() which is used to inform the render loop that changes have been made to the terminal, even though data may not have been written to the terminal file descriptor. Unlike the other public guac_terminal_*() functions, the guac_terminal_notify() function is not threadsafe, and performing actions which invoke that function while another thread may be writing to the terminal via guac_terminal_printf() or guac_terminal_write_stdout() results in undefined behavior, and could easily cause the segfault you're seeing. In the case of both SSH and telnet, output to the terminal and scrolling/resizing the terminal are handled by separate threads.
        Hide
        mike.jumper Michael Jumper added a comment -

        Bumping priority to pending release as this is a reproducible segfault with a known fix.

        Show
        mike.jumper Michael Jumper added a comment - Bumping priority to pending release as this is a reproducible segfault with a known fix.
        Hide
        prudvi Prudvi Raju added a comment -

        I am also facing this issue for the guacamole,

        ERROR o.a.g.w.GuacamoleWebSocketTunnelEndpoint - Connection to guacd terminated abnormally: Connection to guacd timed out.
        13:05:29.245 [Thread-10] INFO o.a.g.tunnel.TunnelRequestService - User "admin" disconnected from connection "general". Duration: 12632540 milliseconds

        very few times, But i am not able reproduce it my self.. it occurs automatically..

        Is there any specific way to regenrete/reproduce this issue?

        Show
        prudvi Prudvi Raju added a comment - I am also facing this issue for the guacamole, ERROR o.a.g.w.GuacamoleWebSocketTunnelEndpoint - Connection to guacd terminated abnormally: Connection to guacd timed out. 13:05:29.245 [Thread-10] INFO o.a.g.tunnel.TunnelRequestService - User "admin" disconnected from connection "general". Duration: 12632540 milliseconds very few times, But i am not able reproduce it my self.. it occurs automatically.. Is there any specific way to regenrete/reproduce this issue?
        Hide
        mike.jumper Michael Jumper added a comment -

        Prudvi Raju, unless there are other log entries which show that a segfault occurred within guacd at the time of the disconnect, there is insufficient information to conclude that the problem you're seeing is the same as that reported here.

        If one of your users is consistently being disconnected, and you suspect it might be due to this issue, I'd say try building guacd from the latest git and see if the problem disappears.

        Show
        mike.jumper Michael Jumper added a comment - Prudvi Raju , unless there are other log entries which show that a segfault occurred within guacd at the time of the disconnect, there is insufficient information to conclude that the problem you're seeing is the same as that reported here. If one of your users is consistently being disconnected, and you suspect it might be due to this issue, I'd say try building guacd from the latest git and see if the problem disappears.

          People

          • Assignee:
            mike.jumper Michael Jumper
            Reporter:
            nischay.rathor Nischay Rathor
          • Votes:
            1 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development