  1. ZooKeeper
  2. ZOOKEEPER-4744

ZooKeeper fails to start after power failure

Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 3.7.1
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

    Description

      The underlying issue stems from consecutive writes to the log file that are not interleaved with fsync operations. Without an fsync between them, the operating system may persist these writes to disk in a different order than they were issued. This is a well-documented behavior of operating systems, and it is addressed in several references.

      This issue can be replicated using LazyFS, a file system capable of simulating power failures and exhibiting the OS behavior mentioned above, i.e., the out-of-order file writes at the disk level. LazyFS persists these writes out of order and then crashes to simulate a power failure.
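To make the "writes interleaved with fsync" idea concrete, here is a minimal Python sketch. It is not ZooKeeper's transaction log code; the file name, payloads, and the helper `write_durably` are hypothetical. It only illustrates the general pattern of forcing each write to stable storage before issuing the next one, so that a later write cannot reach disk ahead of an earlier one.

```python
import os
import tempfile


def write_durably(fd: int, data: bytes, offset: int) -> None:
    """Write `data` at `offset`, then fsync before returning."""
    written = 0
    while written < len(data):
        # os.pwrite may write fewer bytes than requested, so loop until done.
        written += os.pwrite(fd, data[written:], offset + written)
    # Ask the OS to flush this file's data to the storage device.
    os.fsync(fd)


def demo() -> bytes:
    # Hypothetical log file in a temporary directory (assumption, not from the report).
    path = os.path.join(tempfile.mkdtemp(), "log.example")
    fd = os.open(path, os.O_CREAT | os.O_TRUNC | os.O_RDWR, 0o644)
    try:
        write_durably(fd, b"H" * 16, 0)   # stand-in for a 16-byte header
        write_durably(fd, b"R" * 61, 16)  # stand-in for a 61-byte record
    finally:
        os.close(fd)
    with open(path, "rb") as f:
        return f.read()


if __name__ == "__main__":
    print(len(demo()))
```

Calling fsync after every write is expensive, so real systems usually batch writes and sync at chosen points. The sketch only shows where an ordering guarantee would come from.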

      To reproduce this problem, one can follow these steps:

      1. Mount LazyFS on the directory where ZooKeeper data will be saved, specifying a root directory for LazyFS. Assuming the data path for ZooKeeper is /home/data/zk and the root directory is /home/data/zk-root, add the following lines to the default configuration file (config/default.toml):

      [[injection]]
      type="reorder"
      occurrence=1
      op="write"
      file="/home/data/zk-root/version-2/log.100000001"
      persist=[3]

      These lines define the fault to be injected. A power failure will be simulated after the third write to the /home/data/zk-root/version-2/log.100000001 file. Because a file may see more than one group of consecutive writes, the `occurrence` parameter selects which group the fault applies to; here it is the first one.
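When experimenting with different faults, it can help to generate the stanza rather than edit it by hand. The following Python sketch renders an injection block using only the keys shown above (`type`, `occurrence`, `op`, `file`, `persist`). The function name `render_injection` is hypothetical, and the sketch does not validate the values against LazyFS.

```python
from typing import List


def render_injection(path: str, persist: List[int], occurrence: int = 1,
                     fault_type: str = "reorder", op: str = "write") -> str:
    """Return a TOML [[injection]] stanza as text, using the keys from the report."""
    lines = [
        "[[injection]]",
        f'type="{fault_type}"',
        f"occurrence={occurrence}",
        f'op="{op}"',
        f'file="{path}"',
        "persist=[" + ",".join(str(i) for i in persist) + "]",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_injection("/home/data/zk-root/version-2/log.100000001", [3]))
```

The output can be appended to config/default.toml before mounting LazyFS.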

      2. Start LazyFS as the underlying file system of a node in the cluster with the following command:

           ./scripts/mount-lazyfs.sh -c config/default.toml -m /home/data/zk -r /home/data/zk-root -f

      3. Start ZooKeeper with the command:

           apache-zookeeper-3.7.1-bin/bin/zkServer.sh start-foreground

      4. Connect a client to the node that has LazyFS as the underlying file system:

           apache-zookeeper-3.7.1-bin/bin/zkCli.sh -server 127.0.0.1:2181

      Immediately after this step, LazyFS will be unmounted, simulating a power failure, and ZooKeeper will keep printing error messages in the terminal, requiring a forced shutdown.
      At this point, one can analyze the logs produced by LazyFS to examine the system calls issued up to the moment of the fault. Here is a simplified version of the log:

      {'syscall': 'create', 'path': '/home/gsd/data/zk37-root/version-2/log.100000001', 'mode': 'O_TRUNC'}
      {'syscall': 'write', 'path': '/home/data/zk37-root/version-2/log.100000001', 'size': '16', 'off': '0'}
      {'syscall': 'write', 'path': '/home/data/zk37-root/version-2/log.100000001', 'size': '1', 'off': '67108879'}
      {'syscall': 'write', 'path': '/home/data/zk37-root/version-2/log.100000001', 'size': '67108863', 'off': '16'}
      {'syscall': 'write', 'path': '/home/data/zk37-root/version-2/log.100000001', 'size': '61', 'off': '16'}

      Note that the third write is issued by LazyFS for padding.
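The byte ranges in this log can be checked with a short Python sketch. It parses the simplified log entries and computes which ranges of the file would be on disk if only selected writes were persisted. It assumes that `persist=[3]` refers to the 1-based position of a write within the group of consecutive writes; that interpretation, and the helper names, are assumptions rather than something stated in the report.

```python
import ast
from typing import List, Tuple

LOG = """\
{'syscall': 'create', 'path': '/home/gsd/data/zk37-root/version-2/log.100000001', 'mode': 'O_TRUNC'}
{'syscall': 'write', 'path': '/home/data/zk37-root/version-2/log.100000001', 'size': '16', 'off': '0'}
{'syscall': 'write', 'path': '/home/data/zk37-root/version-2/log.100000001', 'size': '1', 'off': '67108879'}
{'syscall': 'write', 'path': '/home/data/zk37-root/version-2/log.100000001', 'size': '67108863', 'off': '16'}
{'syscall': 'write', 'path': '/home/data/zk37-root/version-2/log.100000001', 'size': '61', 'off': '16'}
"""


def parse_writes(log_text: str) -> List[Tuple[int, int]]:
    """Return (offset, size) for every write entry, in issue order."""
    writes = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        entry = ast.literal_eval(line)  # each entry is a Python dict literal
        if entry["syscall"] == "write":
            writes.append((int(entry["off"]), int(entry["size"])))
    return writes


def persisted_ranges(writes: List[Tuple[int, int]],
                     persist: List[int]) -> List[Tuple[int, int]]:
    """Half-open ranges [start, end) covered by the persisted writes.

    Assumes `persist` holds 1-based positions within the write group.
    """
    return [(writes[i - 1][0], writes[i - 1][0] + writes[i - 1][1]) for i in persist]


def covers(ranges: List[Tuple[int, int]], start: int, end: int) -> bool:
    """True if a single persisted range fully contains [start, end)."""
    return any(s <= start and e >= end for s, e in ranges)


if __name__ == "__main__":
    ranges = persisted_ranges(parse_writes(LOG), [3])
    print(ranges)
    print("first 16 bytes persisted:", covers(ranges, 0, 16))
```

Under that assumption, only bytes 16 through 67108878 would be persisted, so the 16-byte write at offset 0 would not be on disk after the simulated crash. Whether this is the exact condition that prevents the restart is not stated in the report.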

       
      5. Remove the fault from the configuration file and unmount the file system with:

           fusermount -uz /home/data/zk

      6. Mount LazyFS again with the previously provided command.

      7. Attempt to start ZooKeeper. The startup fails.

      By following these steps, one can replicate the issue and analyze the effects of the power failure on ZooKeeper's restart process.
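The command sequence from the steps above can be collected in one place with a small Python sketch. It only assembles and prints the commands quoted in this report and does not run them, since the LazyFS mount and the ZooKeeper server both run in the foreground and would need separate terminals. The function name and default arguments are hypothetical conveniences; editing the configuration file (steps 1 and 5) is left as a manual step.

```python
import shlex
from typing import List


def reproduction_commands(config: str = "config/default.toml",
                          mount_dir: str = "/home/data/zk",
                          root_dir: str = "/home/data/zk-root",
                          zk_home: str = "apache-zookeeper-3.7.1-bin") -> List[List[str]]:
    """Return the commands from the report, in the order they are issued."""
    mount = ["./scripts/mount-lazyfs.sh", "-c", config,
             "-m", mount_dir, "-r", root_dir, "-f"]
    start_server = [f"{zk_home}/bin/zkServer.sh", "start-foreground"]
    return [
        mount,                                                     # step 2
        start_server,                                              # step 3
        [f"{zk_home}/bin/zkCli.sh", "-server", "127.0.0.1:2181"],  # step 4
        ["fusermount", "-uz", mount_dir],                          # step 5
        mount,                                                     # step 6
        start_server,                                              # step 7
    ]


if __name__ == "__main__":
    for cmd in reproduction_commands():
        print(shlex.join(cmd))
```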

      Attachments

        1. reported_error.txt
          16 kB
          Maria Ramos

          People

            Assignee: Unassigned
            Reporter: Maria Ramos (mjcr)