ZooKeeper / ZOOKEEPER-2311

assert in setup_random


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.7, 3.5.1
    • Fix Version/s: 3.4.8, 3.5.2, 3.6.0
    • Component/s: c client
    • Labels: None

    Description

      We've started seeing an assert failing inside setup_random at line 537:

       528 static void setup_random()
       529 {
       530 #ifndef _WIN32          // TODO: better seed
       531     int seed;
       532     int fd = open("/dev/urandom", O_RDONLY);
       533     if (fd == -1) {
       534         seed = getpid();
       535     } else {
       536         int rc = read(fd, &seed, sizeof(seed));
       537         assert(rc == sizeof(seed));
       538         close(fd);
       539     }
       540     srandom(seed);
       541     srand48(seed);
       542 #endif
      

      The core files show:

      Program terminated with signal 6, Aborted.
      #0 0x00007f9ff665a0d5 in raise () from /lib/x86_64-linux-gnu/libc.so.6
      #0 0x00007f9ff665a0d5 in raise () from /lib/x86_64-linux-gnu/libc.so.6
      #1 0x00007f9ff665d83b in abort () from /lib/x86_64-linux-gnu/libc.so.6
      #2 0x00007f9ff6652d9e in ?? () from /lib/x86_64-linux-gnu/libc.so.6
      #3 0x00007f9ff6652e42 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
      #4 0x00007f9ff8e4070a in setup_random () at src/zookeeper.c:476
      #5 0x00007f9ff8e40d76 in resolve_hosts (zh=0x7f9fe14de400, hosts_in=0x7f9fd700f400 "10.26.200.6:2181,10.26.200.7:2181,10.26.200.8:2181", avec=0x7f9fd87fab60) at src/zookeeper.c:730
      #6 0x00007f9ff8e40e87 in update_addrs (zh=0x7f9fe14de400) at src/zookeeper.c:801
      #7 0x00007f9ff8e44176 in zookeeper_interest (zh=0x7f9fe14de400, fd=0x7f9fd87fac4c, interest=0x7f9fd87fac50, tv=0x7f9fd87fac80) at src/zookeeper.c:1980
      #8 0x00007f9ff8e553f5 in do_io (v=0x7f9fe14de400) at src/mt_adaptor.c:379
      #9 0x00007f9ff804de9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
      #10 0x00007f9ff671738d in clone () from /lib/x86_64-linux-gnu/libc.so.6
      #11 0x0000000000000000 in ?? ()

      I'm not sure what the underlying cause is, but POSIX always allows read(2) to return fewer bytes than requested, so any program must check for short reads rather than assert that the full length was read.
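
      One possible way to make the seeding tolerant of short reads is sketched below. This is only an illustration, not necessarily the change that was committed for this issue: read_full and seed_from_urandom are hypothetical names that do not exist in zookeeper.c. The loop retries on EINTR, keeps reading after a partial read, and falls back to getpid() (the same fallback the original code uses when open fails) instead of aborting.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of zookeeper.c): read exactly len bytes.
 * Returns 0 once the buffer is full, -1 on end of file or a real error. */
static int read_full(int fd, void *buf, size_t len)
{
    unsigned char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n > 0) {
            p += n;                 /* short read: keep going */
            len -= (size_t)n;
        } else if (n == 0) {
            return -1;              /* unexpected end of file */
        } else if (errno != EINTR) {
            return -1;              /* real error; EINTR is retried */
        }
    }
    return 0;
}

/* Hypothetical seeding routine: never asserts, always returns a seed. */
static int seed_from_urandom(void)
{
    int seed;
    int fd = open("/dev/urandom", O_RDONLY);
    if (fd == -1 || read_full(fd, &seed, sizeof(seed)) != 0) {
        seed = (int)getpid();       /* fallback when the read fails */
    }
    if (fd != -1) {
        close(fd);
    }
    return seed;
}
```

      The returned value could then be passed to srandom() and srand48() exactly as setup_random does today.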

      Has anyone else encountered this issue? We are seeing it rather frequently, which is concerning.

People

    • Assignee: Marshall McMullen
    • Reporter: Marshall McMullen
    • Votes: 0
    • Watchers: 4
