ACCUMULO-1995

TServer refuses to start if WALog file has only header and no entries

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • 1.6.0
    • 1.6.0
    • None
    • None

    Description

      While exercising Accumulo 1.6.0 with encryption turned on, we repeatedly got tablet servers into the following state: the tablet server would not start up and instead printed this error to the log over and over:

      2013-12-09 14:13:04,487 [util.MetadataTableUtil] ERROR: java.lang.IllegalArgumentException: Invalid path string "/accumulo/0147d545-64bd-4383-b842-27c62283c208/root_tablet/walogs/hdfs://10.10.1.10:9000/accumulo/wal/10.10.1.10+9997/e42efc86-6d83-4c0e-abd6-b16ec74d0a9f" caused by empty node name specified @72
      java.lang.IllegalArgumentException: Invalid path string "/accumulo/0147d545-64bd-4383-b842-27c62283c208/root_tablet/walogs/hdfs://10.10.1.10:9000/accumulo/wal/10.10.1.10+9997/e42efc86-6d83-4c0e-abd6-b16ec74d0a9f" caused by empty node name specified @72
      	at org.apache.zookeeper.common.PathUtils.validatePath(PathUtils.java:99)
      	at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1450)
      	at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1496)
      	at org.apache.accumulo.fate.zookeeper.ZooUtil.recursiveDelete(ZooUtil.java:103)
      	at org.apache.accumulo.fate.zookeeper.ZooUtil.recursiveDelete(ZooUtil.java:117)
      	at org.apache.accumulo.fate.zookeeper.ZooReaderWriter.recursiveDelete(ZooReaderWriter.java:64)
      	at org.apache.accumulo.server.util.MetadataTableUtil.removeUnusedWALEntries(MetadataTableUtil.java:611)
      	at org.apache.accumulo.tserver.Tablet.<init>(Tablet.java:1394)
      	at org.apache.accumulo.tserver.Tablet.<init>(Tablet.java:1236)
      	at org.apache.accumulo.tserver.Tablet.<init>(Tablet.java:1091)
      	at org.apache.accumulo.tserver.Tablet.<init>(Tablet.java:1079)
      	at org.apache.accumulo.tserver.TabletServer$AssignmentHandler.run(TabletServer.java:2892)
      	at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
      	at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler$3.run(TabletServer.java:2261)
      

      The log file in question had a valid header but no entries, probably because the tablet server was killed before it had any data to write. The bug lies in a mismatch between how log file names are now stored in ZooKeeper (as full paths) and how MetadataTableUtil eventually constructs the ZooKeeper paths to their metadata. The fix is relatively straightforward: take apart the full HDFS path before appending it to a different ZooKeeper path.
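The failure and the proposed fix can be sketched in isolation. In the logged path, the second slash of "hdfs://" sits at offset 72, which is exactly where ZooKeeper reports the empty node name. The class and method names below are hypothetical and are not taken from the Accumulo source, and keeping only the final path component is an assumption about what "taking apart" the HDFS path means here; the actual patch may differ.

```java
// Minimal sketch, not the Accumulo patch. All names here are hypothetical.
final class WalogZooPath {

    // Reduce a WAL reference (a bare file name or a full URI such as
    // hdfs://host:port/dir/name) to its final path component.
    static String walFileName(String walRef) {
        int end = walRef.length();
        // Ignore any trailing slashes so "dir/name/" still yields "name".
        while (end > 0 && walRef.charAt(end - 1) == '/') {
            end--;
        }
        String trimmed = walRef.substring(0, end);
        String name = trimmed.substring(trimmed.lastIndexOf('/') + 1);
        if (name.isEmpty()) {
            throw new IllegalArgumentException("no file name in WAL reference: " + walRef);
        }
        return name;
    }

    // Build the ZooKeeper node path from the walogs root and the WAL reference.
    static String zooWalogPath(String walogsRoot, String walRef) {
        return walogsRoot + "/" + walFileName(walRef);
    }

    // Offset of the first empty node name (the second slash of a "//"), or -1 if none.
    static int firstEmptyNode(String zkPath) {
        int i = zkPath.indexOf("//");
        return i < 0 ? -1 : i + 1;
    }

    public static void main(String[] args) {
        String root = "/accumulo/0147d545-64bd-4383-b842-27c62283c208/root_tablet/walogs";
        String wal = "hdfs://10.10.1.10:9000/accumulo/wal/10.10.1.10+9997/"
                + "e42efc86-6d83-4c0e-abd6-b16ec74d0a9f";

        // Naive concatenation reproduces the invalid path from the log:
        // the empty node name is at offset 72, matching "@72" in the error.
        System.out.println(firstEmptyNode(root + "/" + wal));

        // Using only the file name gives a path with no empty node names.
        System.out.println(zooWalogPath(root, wal));
    }
}
```

The sketch uses plain string handling so that it runs without Hadoop or ZooKeeper on the classpath; code inside Accumulo would more likely go through Hadoop's Path class to obtain the file name.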

      Patch coming.


            People

              Assignee: Unassigned
              Reporter: Michael Allen (supermallen)
              Votes: 0
              Watchers: 4
