Hadoop HDFS
  Parent: HDFS-3399 BookKeeper option support for NN HA
  Issue: HDFS-3423

BKJM: NN startup fails when recoverUnfinalizedSegments() encounters a bad inProgress_ znode

    Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0
    • Fix Version/s: 2.0.2-alpha
    • Component/s: None
    • Labels:
      None

      Description

      Say the inProgress_000X znode is corrupted because the data (version, ledgerId, firstTxId) was never written to it. NameNode startup includes logic to recover all unfinalized segments. During that recovery it tries to read this segment's metadata, the read fails, and the NameNode shuts down.

      EditLogLedgerMetadata.java:
      
      static EditLogLedgerMetadata read(ZooKeeper zkc, String path)
          throws IOException, KeeperException.NoNodeException {
        byte[] data = zkc.getData(path, false, null);
        String[] parts = new String(data).split(";");
        if (parts.length == 3) {
          // .... reading inprogress metadata
        } else if (parts.length == 4) {
          // .... reading finalized metadata
        } else {
          // a znode with missing or partial data ends up here
          throw new IOException("Invalid ledger entry, "
                                + new String(data));
        }
      }
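
To make the failure concrete, the following is a minimal sketch of just the parsing step, assuming the bad znode holds an empty byte array. The class `LedgerMetadataParseSketch` and the method `countFields` are hypothetical names introduced for illustration. The payload is passed in directly so the example runs without ZooKeeper, and the sample values are made up.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the parsing step of EditLogLedgerMetadata.read().
// It takes the znode payload directly so it can run without ZooKeeper.
final class LedgerMetadataParseSketch {

  /** Returns the number of ';'-separated fields, or throws as read() does. */
  static int countFields(byte[] data) throws IOException {
    String text = new String(data, StandardCharsets.UTF_8);
    String[] parts = text.split(";");
    if (parts.length == 3 || parts.length == 4) {
      return parts.length;
    }
    throw new IOException("Invalid ledger entry, " + text);
  }

  public static void main(String[] args) throws IOException {
    // Well-formed in-progress payload: version;ledgerId;firstTxId (sample values).
    byte[] good = "1;42;100".getBytes(StandardCharsets.UTF_8);
    System.out.println(countFields(good));
    try {
      // Assumption: a znode created but never populated holds no bytes.
      countFields(new byte[0]);
    } catch (IOException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Splitting an empty string yields a single empty field, so the payload matches neither the 3-field nor the 4-field branch and falls through to the IOException.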
      

      Scenario: how can a bad inProgress_000X node be left behind?
      Assume BKJM has created the inProgress_000X znode and ZooKeeper becomes unavailable when BKJM tries to add the metadata. inProgress_000X is then left with partial information.
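
One way to picture a recovery pass that tolerates such a znode is sketched below. This is an illustration under stated assumptions, not necessarily how the attached patches resolve the issue. A `Map` stands in for the ZooKeeper ledgers path, and `InProgressRecoverySketch`, `isReadableInProgress`, and `findBadInProgress` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a Map of znode name to payload stands in for ZooKeeper.
final class InProgressRecoverySketch {

  /** True if the payload has the 3 fields expected for an in-progress segment. */
  static boolean isReadableInProgress(byte[] data) {
    if (data == null || data.length == 0) {
      return false; // znode was created, metadata was never written
    }
    return new String(data, StandardCharsets.UTF_8).split(";").length == 3;
  }

  /** Collects unreadable in-progress znodes instead of throwing on the first one. */
  static List<String> findBadInProgress(Map<String, byte[]> znodes) {
    List<String> bad = new ArrayList<>();
    for (Map.Entry<String, byte[]> e : znodes.entrySet()) {
      boolean inProgress = e.getKey().startsWith("inProgress_");
      if (inProgress && !isReadableInProgress(e.getValue())) {
        bad.add(e.getKey());
      }
    }
    return bad;
  }

  public static void main(String[] args) {
    Map<String, byte[]> znodes = new LinkedHashMap<>();
    znodes.put("inProgress_0001", "1;7;1".getBytes(StandardCharsets.UTF_8));
    znodes.put("inProgress_0002", new byte[0]); // left behind after a ZK outage
    System.out.println(findBadInProgress(znodes));
  }
}
```

With the bad znodes identified up front, the caller can decide how to handle them (for example, log and skip) rather than aborting NameNode startup on the first unreadable entry.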

      1. HDFS-3423.patch
        21 kB
        Uma Maheswara Rao G
      2. HDFS-3423.patch
        21 kB
        Uma Maheswara Rao G
      3. HDFS-3423.diff
        20 kB
        Ivan Kelly
      4. HDFS-3423.diff
        20 kB
        Ivan Kelly
      5. HDFS-3423.diff
        20 kB
        Ivan Kelly


            People

            • Assignee:
              Ivan Kelly
              Reporter:
              Rakesh R
            • Votes:
              0
              Watchers:
              8
