Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: HA branch (HDFS-1623)
    • Fix Version/s: None
    • Component/s: ha
    • Labels:
      None

      Description

      We don't support multiple shared edits dirs, so we should fail to start with an error in this case.

      1. hdfs-2752.txt
        8 kB
        Eli Collins

        Activity

        Eli Collins created issue -
        Suresh Srinivas added a comment -

        Why is this an error?

        Eli Collins added a comment -

        It's not currently, but it should be until we support it (see HDFS-2735, needs tests and fixes).

        Suresh Srinivas added a comment -

        Eli, I am a bit confused. What is special about supporting multiple shared directories that is not handled currently? Why turn it off just because the tests are not there?

        Eli Collins added a comment -

        Sorry for the lack of context; this came up in HDFS-2709. With multiple shared edits dirs, a failure to read from one of them will prevent the edit log tailer from catching up, i.e. users currently get less reliability with multiple shared dirs. Until we know we've got it working reasonably well (i.e. have tests for the common scenarios), it doesn't seem like we should let people shoot themselves in the foot. And while nice to have, it doesn't seem like multiple shared dir support should block an initial release. Agree? Perhaps warn loudly instead of exiting?

        Todd Lipcon added a comment -

        I agree with Eli - we don't currently use the JournalSet abstraction in EditLogTailer, so it can only use a single shared dir. Of course in the future we should support using multiple, but it adds some complexity to the initial release.

        Eli Collins added a comment -

        Patch attached. Running the full test suite for sanity. I left the dupe detection for shared dirs in place since we'll need it when we add multiple shared dir support.
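
The check described in this comment can be illustrated with a minimal sketch. It is not the code from hdfs-2752.txt: the class and method names are hypothetical, the comma-separated input format is an assumption, and collapsing repeated entries (rather than rejecting them) is a choice made for this sketch only. It shows the two steps the comment mentions: duplicate detection first, then refusing any configuration that still names more than one shared dir.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Hadoop: validates a shared edits dir setting.
public class SharedEditsDirCheck {

    /**
     * Parses a comma-separated list of shared edits dirs, drops blank and
     * repeated entries, and rejects a configuration that names more than
     * one distinct dir.
     */
    public static List<String> parseSharedEditsDirs(String configured) {
        // LinkedHashSet keeps the configured order while removing duplicates.
        Set<String> unique = new LinkedHashSet<>();
        if (configured != null) {
            for (String entry : configured.split(",")) {
                String trimmed = entry.trim();
                if (!trimmed.isEmpty()) {
                    unique.add(trimmed);
                }
            }
        }
        if (unique.size() > 1) {
            // Fail fast instead of running with an untested configuration.
            throw new IllegalArgumentException(
                "Multiple shared edits directories are not yet supported: " + unique);
        }
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        System.out.println(parseSharedEditsDirs("file:///mnt/shared/edits"));
        try {
            parseSharedEditsDirs("file:///mnt/sd1,file:///mnt/sd2");
        } catch (IllegalArgumentException e) {
            System.out.println("Refusing to start: " + e.getMessage());
        }
    }
}
```

Keeping the duplicate handling separate from the size check means the size check can later be relaxed without touching the deduplication step.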

        Eli Collins made changes -
        Field Original Value New Value
        Attachment hdfs-2752.txt [ 12513367 ]
        Todd Lipcon added a comment -

        Just one nit: you can remove the following empty javadoc annotation:

        +   * @param conf
        

        Aside from that, +1.

        Eli Collins added a comment -

        Thanks for the review Todd. Fixed the nit and committed.

        Eli Collins made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Hadoop Flags Reviewed [ 10343 ]
        Resolution Fixed [ 1 ]
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-HAbranch-build #70 (See https://builds.apache.org/job/Hadoop-Hdfs-HAbranch-build/70/)
        HDFS-2752. HA: exit if multiple shared dirs are configured. Contributed by Eli Collins

        eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1240916
        Files :

        • /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-1623.txt
        • /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
        • /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java
        • /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSNamesystem.java
        • /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestFailureOfSharedDir.java
        • /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestHAConfiguration.java
        Jitendra Nath Pandey added a comment -

        Todd Lipcon wrote: "I agree with Eli - we don't currently use the JournalSet abstraction in EditLogTailer, so it can only use a single shared dir. Of course in the future we should support using multiple, but it adds some complexity to the initial release."

        EditLogTailer uses FSEditLog which uses JournalSet. I think it should be able to handle multiple shared edits, unless there is another bug.

        Todd Lipcon added a comment -

        The other issue is that, in order to support multiple shared edits, we'd need a quorum-like behavior rather than the current "at least one" behavior. Otherwise you could imagine that, with two shared dirs (SD1 and SD2) and two NNs, you might have the case where NN1 is writing to only SD1 and NN2 is reading from only SD2. Let's continue this discussion on HDFS-2782.
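
The SD1/SD2 scenario can be checked with a small brute-force sketch. It assumes "quorum-like" means a strict majority of the shared dirs, which is an interpretation and not something this thread specifies; the class and method names are hypothetical. Each set of dirs is encoded as a bitmask, and the code asks whether a writer and a reader that each use at least `minSize` dirs can ever end up with nothing in common.

```java
// Hypothetical illustration, not Hadoop code: compares "at least one"
// with majority semantics for a small number of shared dirs.
public class SharedDirOverlap {

    /** Smallest number of dirs that forms a strict majority of total. */
    public static int majority(int total) {
        return total / 2 + 1;
    }

    /**
     * Returns true if every pair of dir sets with at least minSize members
     * (out of total dirs) shares at least one dir. Sets are bitmasks, so
     * total must be small.
     */
    public static boolean alwaysOverlap(int total, int minSize) {
        int limit = 1 << total;
        for (int written = 1; written < limit; written++) {
            if (Integer.bitCount(written) < minSize) {
                continue;
            }
            for (int read = 1; read < limit; read++) {
                if (Integer.bitCount(read) < minSize) {
                    continue;
                }
                if ((written & read) == 0) {
                    // Writer and reader use disjoint dirs, e.g. SD1 vs SD2.
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // "At least one" with two dirs: NN1 on SD1 only, NN2 on SD2 only is possible.
        System.out.println(alwaysOverlap(2, 1));
        // Majority of two dirs is both dirs, so writer and reader always overlap.
        System.out.println(alwaysOverlap(2, majority(2)));
    }
}
```

With two dirs a strict majority is both of them, so majority semantics would tolerate no dir failure at all; three or more dirs would be needed for that rule to add fault tolerance.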


          People

          • Assignee:
            Eli Collins
            Reporter:
            Eli Collins
          • Votes:
            0
            Watchers:
            5
