Qpid Dispatch / DISPATCH-902

Intermittent crash with link to broker when broker closed

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.0.0
    • Fix Version/s: 1.1.0
    • Component/s: None
    • Labels: None

    Description

      When using dispatch in a 2-node configuration with a broker between them:

              9002           10001           10001            9003
      sender ----> dispatch1 -----> qpid-cpp -----> dispatch2 -----> receiver
      

      and initializing in the following order:

      1. start dispatch1
      2. start dispatch2
      3. start qpid-cpp
      4. wait for "Link Route Activated" messages on both dispatch nodes
      5. stop qpid-cpp

      then the dispatch nodes crash with a segmentation fault (dumping core) after a random amount of time, having first logged a random number of messages of this form:

      (info) Connection to localhost:10001 failed: proton:io Connection refused - on read from localhost:10001
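
The start-up order and log markers described above could be scripted roughly as follows. This is only a sketch under several assumptions that are not in the report: `qdrouterd` and `qpidd` are on the PATH, the three attached config files sit in the working directory, and the routers write their log to stdout/stderr (the attached configs may direct it elsewhere). The log file names and all helper names are hypothetical.

```python
import signal
import subprocess
import time
from pathlib import Path

ACTIVATED = "Link Route Activated"  # log text quoted in step 4 of the report


def wait_for(log_path, needle, timeout=60.0, poll=0.5):
    """Return True once `needle` appears in the file at `log_path`, False on timeout."""
    deadline = time.monotonic() + timeout
    path = Path(log_path)
    while True:
        if path.exists() and needle in path.read_text(errors="replace"):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)


def count_refused(log_text, target="localhost:10001"):
    """Count the 'Connection refused' log lines quoted in the report."""
    marker = f"Connection to {target} failed: proton:io Connection refused"
    return sum(1 for line in log_text.splitlines() if marker in line)


def segfaulted(returncode):
    """On POSIX, Popen.returncode is -N when the child was killed by signal N."""
    return returncode == -signal.SIGSEGV


def start(cmd, log_path):
    """Start a process with stdout and stderr redirected to `log_path` (assumed log sink)."""
    with open(log_path, "w") as log:
        return subprocess.Popen(cmd, stdout=log, stderr=subprocess.STDOUT)


def reproduce(workdir=".", crash_timeout=600.0):
    """Run steps 1-5, then wait for a router to exit; returns the two router return codes."""
    work = Path(workdir)
    logs = [work / "dispatch1.log", work / "dispatch2.log"]  # hypothetical names
    routers = [
        start(["qdrouterd", "-c", str(work / "qdrouterd.node1.conf")], logs[0]),  # step 1
        start(["qdrouterd", "-c", str(work / "qdrouterd.node2.conf")], logs[1]),  # step 2
    ]
    broker = start(["qpidd", "--config", str(work / "qpidd.d2n.conf")],
                   work / "qpidd.log")                                             # step 3
    try:
        for log in logs:                                                           # step 4
            if not wait_for(log, ACTIVATED):
                raise RuntimeError(f"no '{ACTIVATED}' seen in {log}")
        broker.terminate()                                                         # step 5
        broker.wait()
        deadline = time.monotonic() + crash_timeout
        while all(r.poll() is None for r in routers) and time.monotonic() < deadline:
            time.sleep(1.0)
        return [r.returncode for r in routers]  # None means still running at the deadline
    finally:
        for proc in routers + [broker]:
            if proc.poll() is None:
                proc.kill()
```

Calling `reproduce()` and passing each result to `segfaulted()` would indicate whether a router died with SIGSEGV; `count_refused()` applied to a router log gives the number of refused-connection messages logged before the crash.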

      The stack trace is as follows for all occurrences:

      Thread 3 "qdrouterd" received signal SIGSEGV, Segmentation fault.
      [Switching to Thread 0x7fffea269700 (LWP 10954)]
      pn_transport_tail_closed (transport=0x0) at /home/kpvdr/RedHat/qpid-proton/proton-c/src/core/transport.c:3044
      3044	bool pn_transport_tail_closed(pn_transport_t *transport) { return transport->tail_closed; }
      (gdb) thread apply all bt
      
      Thread 5 (Thread 0x7fffe9267700 (LWP 10956)):
      #0  0x00007ffff67eb6d3 in epoll_wait () at ../sysdeps/unix/syscall-template.S:84
      #1  0x00007ffff77327e2 in proactor_do_epoll (p=0x89b550, can_block=can_block@entry=true) at /home/kpvdr/RedHat/qpid-proton/proton-c/src/proactor/epoll.c:1978
      #2  0x00007ffff77337ca in pn_proactor_wait (p=<optimized out>) at /home/kpvdr/RedHat/qpid-proton/proton-c/src/proactor/epoll.c:2025
      #3  0x00007ffff7bbc219 in thread_run (arg=0x89ec20) at /home/kpvdr/RedHat/qpid-dispatch/src/server.c:932
      #4  0x00007ffff75185ca in start_thread (arg=0x7fffe9267700) at pthread_create.c:333
      #5  0x00007ffff67eb0cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
      
      Thread 4 (Thread 0x7fffe9a68700 (LWP 10955)):
      #0  0x00007ffff67eb6d3 in epoll_wait () at ../sysdeps/unix/syscall-template.S:84
      #1  0x00007ffff77327e2 in proactor_do_epoll (p=0x89b550, can_block=can_block@entry=true) at /home/kpvdr/RedHat/qpid-proton/proton-c/src/proactor/epoll.c:1978
      #2  0x00007ffff77337ca in pn_proactor_wait (p=<optimized out>) at /home/kpvdr/RedHat/qpid-proton/proton-c/src/proactor/epoll.c:2025
      #3  0x00007ffff7bbc219 in thread_run (arg=0x89ec20) at /home/kpvdr/RedHat/qpid-dispatch/src/server.c:932
      #4  0x00007ffff75185ca in start_thread (arg=0x7fffe9a68700) at pthread_create.c:333
      #5  0x00007ffff67eb0cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
      
      Thread 3 (Thread 0x7fffea269700 (LWP 10954)):
      #0  pn_transport_tail_closed (transport=0x0) at /home/kpvdr/RedHat/qpid-proton/proton-c/src/core/transport.c:3044
      #1  0x00007ffff794f4f9 in pn_connection_driver_read_closed (d=d@entry=0x7fffdc054288) at /home/kpvdr/RedHat/qpid-proton/proton-c/src/core/connection_driver.c:109
      #2  0x00007ffff7731ef1 in pconnection_rclosed (pc=0x7fffdc053ce0) at /home/kpvdr/RedHat/qpid-proton/proton-c/src/proactor/epoll.c:898
      #3  pconnection_process (pc=0x7fffdc053ce0, events=<optimized out>, timeout=timeout@entry=false, topup=topup@entry=false) at /home/kpvdr/RedHat/qpid-proton/proton-c/src/proactor/epoll.c:1084
      #4  0x00007ffff7732945 in proactor_do_epoll (p=0x89b550, can_block=can_block@entry=true) at /home/kpvdr/RedHat/qpid-proton/proton-c/src/proactor/epoll.c:2007
      #5  0x00007ffff77337ca in pn_proactor_wait (p=<optimized out>) at /home/kpvdr/RedHat/qpid-proton/proton-c/src/proactor/epoll.c:2025
      #6  0x00007ffff7bbc219 in thread_run (arg=0x89ec20) at /home/kpvdr/RedHat/qpid-dispatch/src/server.c:932
      #7  0x00007ffff75185ca in start_thread (arg=0x7fffea269700) at pthread_create.c:333
      #8  0x00007ffff67eb0cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
      
      Thread 2 (Thread 0x7fffeaa6a700 (LWP 10953)):
      #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
      #1  0x00007ffff7ba2949 in sys_cond_wait (cond=<optimized out>, held_mutex=<optimized out>) at /home/kpvdr/RedHat/qpid-dispatch/src/posix/threading.c:91
      #2  0x00007ffff7bb0cf5 in router_core_thread (arg=0x8f8c90) at /home/kpvdr/RedHat/qpid-dispatch/src/router_core/router_core_thread.c:66
      #3  0x00007ffff75185ca in start_thread (arg=0x7fffeaa6a700) at pthread_create.c:333
      #4  0x00007ffff67eb0cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
      
      Thread 1 (Thread 0x7ffff7fbb180 (LWP 10946)):
      #0  0x00007ffff67eb6d3 in epoll_wait () at ../sysdeps/unix/syscall-template.S:84
      #1  0x00007ffff77327e2 in proactor_do_epoll (p=0x89b550, can_block=can_block@entry=true) at /home/kpvdr/RedHat/qpid-proton/proton-c/src/proactor/epoll.c:1978
      #2  0x00007ffff77337ca in pn_proactor_wait (p=<optimized out>) at /home/kpvdr/RedHat/qpid-proton/proton-c/src/proactor/epoll.c:2025
      #3  0x00007ffff7bbc219 in thread_run (arg=arg@entry=0x89ec20) at /home/kpvdr/RedHat/qpid-dispatch/src/server.c:932
      #4  0x00007ffff7bbc2f0 in qd_server_run (qd=<optimized out>) at /home/kpvdr/RedHat/qpid-dispatch/src/server.c:1186
      #5  0x00000000004017dc in main_process (config_path=0x7fffffffda56 "/home/kpvdr/RedHat/install/etc/qpid-dispatch/qdrouterd.node2.conf", python_pkgdir=<optimized out>, fd=2)
          at /home/kpvdr/RedHat/qpid-dispatch/router/src/main.c:111
      #6  0x00000000004015ec in main (argc=3, argv=0x7fffffffd638) at /home/kpvdr/RedHat/qpid-dispatch/router/src/main.c:318
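
In the trace above, frame #0 is entered with `transport=0x0` and then evaluates `transport->tail_closed`, so the fault is a NULL pointer dereference inside `pn_transport_tail_closed`. When checking many core dumps for the same signature, a small script can flag frames that receive a NULL pointer argument. The following is a sketch only: it assumes gdb's default `bt` frame layout as shown in this report, and the function and pattern names are hypothetical.

```python
import re

# Matches gdb frame lines such as
#   "#0  func (arg=0x0) at file.c:10"
#   "#1  0x00007fff... in func (a=1, b=0x0) at file.c:20"
FRAME = re.compile(r"^\s*#(\d+)\s+(?:0x[0-9a-f]+ in )?(\w+) \((.*?)\)(?: at |\s*$)")

# An argument whose value is exactly 0x0 (not a longer address starting with 0x0...).
NULL_ARG = re.compile(r"(\w+)=0x0(?![0-9a-fx])")


def null_pointer_frames(backtrace):
    """Return (frame_number, function, [null_argument_names]) for each matching frame."""
    hits = []
    for line in backtrace.splitlines():
        match = FRAME.match(line)
        if not match:
            continue  # locals from "bt full", thread headers, blank lines, etc.
        nulls = NULL_ARG.findall(match.group(3))
        if nulls:
            hits.append((int(match.group(1)), match.group(2), nulls))
    return hits
```

Applied to the Thread 3 backtrace, this reports only frame 0, `pn_transport_tail_closed`, with the NULL argument `transport`; the callers in frames #1 to #3 all carry non-NULL pointers.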
      

      More detail:

      (gdb) bt full
      #0  pn_transport_tail_closed (transport=0x0) at /home/kpvdr/RedHat/qpid-proton/proton-c/src/core/transport.c:3044
      No locals.
      #1  0x00007ffff794f4f9 in pn_connection_driver_read_closed (d=d@entry=0x7fffe0071108) at /home/kpvdr/RedHat/qpid-proton/proton-c/src/core/connection_driver.c:109
      No locals.
      #2  0x00007ffff7731ef1 in pconnection_rclosed (pc=0x7fffe0070b60) at /home/kpvdr/RedHat/qpid-proton/proton-c/src/proactor/epoll.c:898
      No locals.
      #3  pconnection_process (pc=0x7fffe0070b60, events=<optimized out>, timeout=timeout@entry=false, topup=topup@entry=false) at /home/kpvdr/RedHat/qpid-proton/proton-c/src/proactor/epoll.c:1084
              inbound_wake = <optimized out>
              rearm_timer = <optimized out>
              timer_fired = <optimized out>
              waking = false
              tick_required = false
              rearm_pc = <optimized out>
      #4  0x00007ffff7732945 in proactor_do_epoll (p=0x89b550, can_block=can_block@entry=true) at /home/kpvdr/RedHat/qpid-proton/proton-c/src/proactor/epoll.c:2007
              batch = 0x0
              ev = {events = 21, data = {ptr = 0x7fffe0070b70, fd = -536409232, u32 = 3758558064, u64 = 140736951946096}}
              n = <optimized out>
              ee = 0x7fffe0070b70
              timeout = -1
      #5  0x00007ffff77337ca in pn_proactor_wait (p=<optimized out>) at /home/kpvdr/RedHat/qpid-proton/proton-c/src/proactor/epoll.c:2025
      No locals.
      #6  0x00007ffff7bbc219 in thread_run (arg=0x89ec20) at /home/kpvdr/RedHat/qpid-dispatch/src/server.c:932
              events = <optimized out>
              e = <optimized out>
              qd_server = 0x89ec20
              running = true
      #7  0x00007ffff75185ca in start_thread (arg=0x7fffe9a68700) at pthread_create.c:333
              __res = <optimized out>
              pd = 0x7fffe9a68700
              now = <optimized out>
              unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140737113392896, -7680037156526760930, 140737488344079, 4096, 140737113392896, 140737113393600, 7679998188122266654, 7680018654140947486}, 
                    mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
              not_first_call = <optimized out>
              pagesize_m1 = <optimized out>
              sp = <optimized out>
              freesize = <optimized out>
      #8  0x00007ffff67eb0cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
      No locals.
      

      Dispatch, qpid-cpp and Proton were all built from master yesterday (Dec 14).

      Attachments

        1. testme.tgz (1 kB, Alan Conway)
        2. qpidd.d2n.conf (1 kB, Kim van der Riet)
        3. qdrouterd.node1.conf (2 kB, Kim van der Riet)
        4. qdrouterd.node2.conf (2 kB, Kim van der Riet)

            People

              Assignee: Ganesh Murthy (gmurthy)
              Reporter: Kim van der Riet (kpvdr)
              Votes: 0
              Watchers: 4
