Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    1. HADOOP-6001.patch
      19 kB
      Luca Telloli
    2. HADOOP-6001.patch
      19 kB
      Luca Telloli
    3. HADOOP-6001.patch
      18 kB
      Luca Telloli
    4. HDFS-397.patch
      20 kB
      Luca Telloli
    5. HDFS-397.patch
      18 kB
      Luca Telloli

      Issue Links

        Activity

        Luca Telloli created issue -
        Luca Telloli added a comment -

        StorageDirectory is currently a class used to manage both FSIMAGE and edits.
        When used for edits, it could be incorporated into the EditLogFileInput/Output classes.
        This should help the development of HADOOP-5188

        Luca Telloli added a comment -

        Initial patch.

        • EditLogFileOutputStream now handles a StorageDirectory directly.
        • The previous methods to revert and divert file streams have been moved inside this class, since they are file related.
        • The signatures of some methods in FSEditLog now take storage directories instead of files.
        • The method FSEditLog.revertFileStreams() has been removed.

        Ran the core tests matching namenode/Test* and they passed.
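The changes above can be sketched roughly as follows. This is a minimal illustration under assumed names (the *Sketch classes are stand-ins, not the actual Hadoop classes): the stream is constructed from a storage directory and derives its own edit files from it, so divert/revert become stream-local operations.

```java
import java.io.File;

// Stand-in for the real StorageDirectory: knows where its "current" dir is.
class StorageDirectorySketch {
    final File root;
    StorageDirectorySketch(File root) { this.root = root; }
    File getCurrentDir() { return new File(root, "current"); }
}

// Stand-in for EditLogFileOutputStream after the patch.
class EditLogFileOutputStreamSketch {
    private final StorageDirectorySketch sd;
    private File editFile;

    // The stream is handed the whole directory, not a specific file...
    EditLogFileOutputStreamSketch(StorageDirectorySketch sd) {
        this.sd = sd;
        this.editFile = new File(sd.getCurrentDir(), "edits");
    }

    // ...and divert/revert become stream-local, file-related operations.
    void divert() { editFile = new File(sd.getCurrentDir(), "edits.new"); }
    void revert() { editFile = new File(sd.getCurrentDir(), "edits"); }

    File currentFile() { return editFile; }
}
```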

        Luca Telloli made changes -
        Field Original Value New Value
        Attachment HADOOP-6001.patch [ 12410418 ]
        Luca Telloli made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Luca Telloli made changes -
        Link This issue blocks HADOOP-5189 [ HADOOP-5189 ]
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12410418/HADOOP-6001.patch
        against trunk revision 785025.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        -1 javadoc. The javadoc tool appears to have generated 1 warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/501/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/501/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/501/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/501/console

        This message is automatically generated.

        Luca Telloli added a comment - - edited

        As far as I understand from the console output of Hudson:

        • if I'm not mistaken, the javadoc warning is not directly related to the content of the patch
           [exec]   [javadoc] javadoc: warning - Error fetching URL: http://java.sun.com/javase/6/docs/api/package-list
        • the failing contrib test is the one related to HADOOP-6007

        so this patch seems to me ready to be reviewed and eventually committed.

        Luca Telloli added a comment -

        Added some Javadoc documentation. No real code has been modified.

        Luca Telloli made changes -
        Attachment HADOOP-6001.patch [ 12411118 ]
        Benjamin Reed added a comment -

        The patch looks fine, but you haven't motivated it. It looks like you are just doing code cleanup. Can you explain why you must do this to plug in BookKeeper?

        Luca Telloli added a comment -

        I would like to avoid in HADOOP-5189 situations like:

        if (stream is of type Bookkeeper)
        do this
        else
        do that

        Code in FSEditLog is mostly file related, and many methods (rollEditLog, purgeEditLog) manipulate files. Storage directories, for instance, are specific to file-based logging.

        This patch should help to isolate, as far as possible, all the places in FSEditLog where file-dependent calls are made.

        I was also hoping that dividing the work into multiple independent sub-patches for 5188/5189 would speed up committing.
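The kind of type-based branching shown above is what polymorphic streams avoid. A minimal sketch, assuming hypothetical class names (the real FSEditLog/EditLogOutputStream APIs differ): the caller programs against an abstract stream and never inspects the concrete type.

```java
// Illustrative stand-ins, not the real Hadoop classes.
abstract class EditLogOutputStreamSketch {
    abstract String write(String op); // returns a label for demonstration
}

class FileStreamSketch extends EditLogOutputStreamSketch {
    String write(String op) { return "file:" + op; }
}

class BookKeeperStreamSketch extends EditLogOutputStreamSketch {
    String write(String op) { return "bookkeeper:" + op; }
}

class FSEditLogSketch {
    // No "if (stream is of type Bookkeeper)" branch: dynamic dispatch
    // selects the right behavior for each concrete stream.
    static String logEdit(EditLogOutputStreamSketch s, String op) {
        return s.write(op);
    }
}
```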

        Luca Telloli added a comment -

        I guess we need some additional clarification on the purpose of this patch and on the interrelationship between this one and the other ones we're working on (5188, 5189, 5832, etc).

        The overall goal of 5188 is simple: to have EditLogInput/OutputStream properly encapsulate everything that is needed to log NameNode operations, and FSEditLog call the same methods independently of the type of stream it is working on.

        Then, with 5832 I moved a step towards this goal: treating the properties in the .xml file which specify the log devices and their parameters as a URI. While working on that patch I realized that I couldn't pass a URI as an argument to EditLogFileInput/OutputStream, because the constructor was dependent on the role of the file. By role of the file, I mean edits or edits.new, the two different files used in file-based logging to take care of changes. You can see this if you look at the original code for FSEditLog, where EditLogFileOutputStream is instantiated at times with the EDITS file, and at other times with the EDITS_NEW file. This is bad, because it breaks generality.

        With some simple modifications, contained in this patch, I can now pass EditLogFileOutputStream the container of these files, which is a StorageDirectory, and, depending on the call from FSEditLog, switch between edits and edits.new inside EditLogFileOutputStream. This is transparent to FSEditLog, which no longer has to instantiate the class based on a specific file and its role.
        At the same time, a StorageDirectory is exactly what is passed inside the URI of a file-based logger, which connects this patch to 5832.

        I hope this clarifies some of the doubts about this patch, but I'm happy to answer more questions if they come up.

        Owen O'Malley made changes -
        Project Hadoop Common [ 12310240 ] HDFS [ 12310942 ]
        Key HADOOP-6001 HDFS-397
        Flavio Junqueira added a comment -

        Thanks for the clarification, Luca. I'm not sure I understand your last comment on StorageDirectory being passed inside a URI. I understand that StorageDirectory gives you some isolation with respect to which file has to be used (EDITS or EDITS_NEW). However, the correlation between URI and StorageDirectory is still unclear to me. Could you please elaborate a little more?

        I have a couple of comments on the latest patch you posted:

        1. In the modifications to BackupStorage, you have replaced a call to revertFileStreams() with one to rollEditLog(). Inspecting the code of rollEditLog(), I've seen that it calls divertFileStreams() instead. Should you be calling purgeEditLog() instead?
        2. Why have you made the constructor of EditLogFileOutputStream public? From this patch, it doesn't look like it is necessary for it to be public, and it sounds like it should be package protected.
        Luca Telloli added a comment -

        Flavio,
        about the relationship between StorageDirectory and URI: in the URI for a logging device you want to pass all the relevant information needed to initialize that device. In the case of file-based logging, I think that passing the storage directory is sufficient.

        For instance, the example URI file:///tmp/hdfs would pass a storage directory located in /tmp/hdfs, and the implementation inside this patch would only need this information to deal with the log files contained in that directory. As I said earlier, this implies that there is no more direct instantiation of this class with one of the edit files or the other, which makes this object more general.
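The URI-to-directory mapping described here can be sketched as follows; FileJournalLocator and directoryFor are hypothetical names for illustration, not part of the patch. For a file-based journal, everything the stream needs can be recovered from the URI path.

```java
import java.io.File;
import java.net.URI;

// Hypothetical helper: map a file:// journal URI to its storage directory.
class FileJournalLocator {
    static File directoryFor(URI uri) {
        if (!"file".equals(uri.getScheme()))
            throw new IllegalArgumentException("not a file-based journal: " + uri);
        // file:///tmp/hdfs -> /tmp/hdfs, the storage directory root
        return new File(uri.getPath());
    }
}
```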

        To answer the other questions:
        1. You're right, that call should be to purgeEditLog instead. I'm actually very surprised that the unit test doesn't fail; maybe I should take a look at that test.
        Also, as a note, purgeEditLog just calls revertFileStream(), so in that case I replaced purgeEditLog() with the code of the existing revertFileStream(). rollEditLog() calls divertFileStream() but performs some other operations before doing so, so I haven't replaced it, although I guess I could merge the two methods for some homogeneity.
        2. Good point, I might have overlooked that detail, thanks.

        I'll post a new patch with the fixes soon; thanks for the comments.

        Luca Telloli made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Luca Telloli added a comment -

        New patch addressing the latest comments.

        Luca Telloli made changes -
        Attachment HADOOP-6001.patch [ 12411392 ]
        Luca Telloli made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Raghu Angadi made changes -
        Assignee Luca Telloli [ lucat ]
        Flavio Junqueira added a comment -

        +1, it looks good to me.

        Luca Telloli added a comment -

        The latest attachment applies correctly against the project split.
        A test-core run shows no issues apart from TestFTPFileSystem, which does not seem related to this patch. I attach the stack trace as proof:

        Testcase: testReadWrite took 0.298 sec
            Caused an ERROR
        FTP server start-up failed
        java.lang.RuntimeException: FTP server start-up failed
            at org.apache.hadoop.fs.ftp.TestFTPFileSystem.startServer(TestFTPFileSystem.java:77)
            at org.apache.hadoop.fs.ftp.TestFTPFileSystem.setUp(TestFTPFileSystem.java:89)
        Caused by: org.apache.ftpserver.FtpServerConfigurationException: User data file specified but could not be located, neither on the file system or in the classpath: /Users/telloli/Documents/work/eclipse-workspace/hdfs-trunk/src/test/ftp.user.properties
        

        A full test-patch run was performed before the project split (so with the previous patch). The only changes between the previous and current patch are the paths of the patched files; no change to the code has been made.

        Luca Telloli made changes -
        Attachment HDFS-397.patch [ 12412017 ]
        Raghu Angadi added a comment -

        I am probably not the right person to review...

        I assume this is essentially a code reorg with no changes in semantics.. looks like one of these is not correct:

        -    editLog.divertFileStreams(STORAGE_JSPOOL_DIR + "/" + STORAGE_JSPOOL_FILE);
        +    editLog.divertFileStreams()
        
        -    divertFileStreams(
        -        Storage.STORAGE_DIR_CURRENT + "/" + NameNodeFile.EDITS_NEW.getName());
        +    divertFileStreams()
        

        Am I reading the patch correctly?

        Could you also point out the locations in the HDFS-311 patch where this patch matters? (Line numbers in the patch would be enough.) Thanks.

        Raghu Angadi made changes -
        Link This issue blocks HDFS-311 [ HDFS-311 ]
        Luca Telloli added a comment -

        Raghu

        1. The second piece of code is correct (at least for the purpose of the patch).
        The first piece of code should, I guess, also be correct, due to a name clash, since

        private static final String STORAGE_JSPOOL_FILE = 
                                                      NameNodeFile.EDITS_NEW.getName();
        

        I assumed it's correct since it passes the unit tests.

        2. In HDFS-311, for instance, this patch matters every time a constructor for an Input/Output stream is called.
        Have a look, for instance, at line 1497 and following.

        Finally, I think one advantage of this patch is that it avoids calling unnecessary constructors for Input/Output streams,
        which in the previous code stemmed from the need to strictly keep the order of streams and the order of storage directories in sync.

        Raghu Angadi added a comment -

        Regd 1): What about 'STORAGE_JSPOOL_DIR', which points to a different directory from 'current'?
        JSPOOL_FILE also 'happens' to be the same, but the intention of the original author might have been to be able to change it.
        We should remove these constants if they are not used.

        > I assumed it's correct since it passed the unit tests correctly.

        hmm.. I don't think that is sufficient or wise. When I program I would like to know what I am doing and why it is correct at every line.

        > Have a look for instance to line 1497 and following.
        Is it for the latest patch attached there? I am asking this mainly so that I don't have to go through the big patch for HDFS-311.

        Luca Telloli added a comment -

        Raghu, for the latest question I suggest downloading
        https://issues.apache.org/jira/secure/attachment/12413314/HDFS-311.patch
        and having a look, for instance, at:
        line 10: calling new EditLogFileInputStream on a StorageDirectory
        line 345 and below: where constructors for new EditLogOutputStream are instantiated, and every time newInstance() is called

        About BackupStorage, yes, there might be a problem there. I have to go a bit in depth into the Backup* architecture, which might require some time. At the same time I believe a couple of things:

        • I agree that unit tests are not sufficient, but in general I think they should fail on non-standard, non-planned cases. If, given a wrong directory, this test doesn't fail, then I assume the test is broken or needs some fixing.
        • Since this patch deals with EditLogFileOutputStream instantiated with a directory, I'm guessing a change to support such a jspool dir can be easily implemented. But at the same time, in the design document for HADOOP-4539 I read "Journal Spool is an analog of 'edits.new'", which tells me a standard storage directory could be used. What's your take on this?
        Raghu Angadi added a comment -

        > [...] What's your take on this?

        This is expected to be only a minor reorg of code. But it is not, and it does not matter whether those unintended changes are correct or not. btw, the bug would show up during upgrade, I guess.
        If the patch made sure it didn't change any semantics, neither of us would need to go through or completely follow what the backup namenode does.

        I think a clear and simple bug in the patch is taking a lot of effort to resolve.
        It is very unrealistic to expect tests to catch every random bug and typo in a patch (for anything I have ever worked on or known).

        I guess my main point is: there is no change in semantics expected, and let's make sure that is the case. If there is a functional change, please make that explicit.

        Thanks for the references to the patch in HDFS-311.. will take a look.

        Luca Telloli added a comment -

        Raghu, I acknowledge that there might be a problem with the current code, even though the unit tests for BackupNode pass correctly, and I implemented a few changes that should fix this issue. The problematic lines, as you correctly pointed out, are:

        @@ -295,7 +295,7 @@
        -    editLog.divertFileStreams(STORAGE_JSPOOL_DIR + "/" + STORAGE_JSPOOL_FILE);
        +    editLog.divertFileStreams();
        
        @@ -363,7 +363,7 @@
        -    editLog.revertFileStreams(STORAGE_JSPOOL_DIR + "/" + STORAGE_JSPOOL_FILE);
        +    editLog.purgeEditLog();
        

        From what I understand, the BackupStorage wants to write into a subdirectory of a storage directory that is not STORAGE_DIR_CURRENT, but STORAGE_JSPOOL_DIR.

        Given the original code, and reading current trunk, what will happen in divertFileStreams is the following:

        • the code iterates over all current storage directories of type EDITS and open streams of type FILE
        • checks whether the stream matches the directory
        • then, for each stream:
          – closes the stream
          – creates a new stream, which in this case would point to STORAGE_JSPOOL_DIR/STORAGE_JSPOOL_FILE
          – replaces the old stream with the new stream in the list of streams

        On revertFileStreams, what happens is the following:

        • the code iterates over all current storage directories of type EDITS and open streams of type FILE
        • checks whether the stream matches the directory
        • then, for each stream:
          – closes STORAGE_JSPOOL_DIR/edits.new
          – renames the parameter source to CURRENT/edits

        I'm not sure why the file in STORAGE_JSPOOL_DIR gets renamed to the edits file in STORAGE_DIR_CURRENT, but I made a couple of modifications to my code that implement that same mechanism.

        I understand, though, that the mechanism of writing into a subdirectory is due to the fact that, even if BackupNode and NameNode run on different machines, they might be writing to a shared storage directory, for instance on NFS, which I guess is a concrete example of when the test could fail, but one not directly implementable in the test itself.
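The divert/revert mechanics described above amount to creating a spool file and later renaming it back over current/edits. A rough, self-contained sketch under assumed names and paths (not the actual BackupStorage code):

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

// Illustrative stand-in for the divert/revert file operations.
class DivertRevertSketch {
    // divert: stop writing to current/edits and open the spool file instead
    static File divert(File spoolFile) {
        spoolFile.getParentFile().mkdirs();
        try {
            new FileOutputStream(spoolFile).close(); // create an empty spool file
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return spoolFile;
    }

    // revert: close the spool file and rename it back to current/edits
    static File revert(File currentDir, File spoolFile) {
        File edits = new File(currentDir, "edits");
        if (!spoolFile.renameTo(edits))
            throw new RuntimeException("rename failed: " + spoolFile);
        return edits;
    }
}
```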

        My modifications include adding a JournalType parameter to the divertFileStreams() and purgeEditLog() methods, as follows:

        -    editLog.divertFileStreams(STORAGE_JSPOOL_DIR + "/" + STORAGE_JSPOOL_FILE);
        +    editLog.divertFileStreams(JournalType.BACKUP);
        

        and

        -    editLog.revertFileStreams(STORAGE_JSPOOL_DIR + "/" + STORAGE_JSPOOL_FILE);
        +    editLog.purgeEditLog(JournalType.BACKUP);
        

        which is then used in the EditLogFileOutputStream class to make the correct decisions. The final strings in BackupStorage have been made protected so I can read them from the EditLogFileOutputStream class.
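The JournalType dispatch could look roughly like this; the enum values and directory names are assumptions for illustration, not the patch's actual code. The point is that the stream derives the divert target from the journal type instead of receiving a path string.

```java
import java.io.File;

// Illustrative journal types: regular edits vs. the backup node's spool.
enum JournalTypeSketch { EDITS, BACKUP }

class JournalFileChooser {
    static File divertTarget(File storageRoot, JournalTypeSketch type) {
        if (type == JournalTypeSketch.BACKUP) {
            // the backup node spools into its own subdirectory
            return new File(new File(storageRoot, "jspool"), "edits.new");
        }
        // a regular name node diverts to current/edits.new
        return new File(new File(storageRoot, "current"), "edits.new");
    }
}
```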

        So I'd say that this is still a reorganization of code (whether a minor one, I'll let you decide), and there is no change in semantics implied or expected. Unit tests pass fine on Linux.

        > It is very unrealistic to expect tests to catch every random bug and typo in a patch (for anything I have ever worked on or known.)

        I agree with you, but that's not what I said.

        Luca Telloli made changes -
        Attachment HDFS-397.patch [ 12413555 ]
        Konstantin Shvachko made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Gavin made changes -
        Link This issue blocks HDFS-234 [ HDFS-234 ]
        Gavin made changes -
        Link This issue is depended upon by HDFS-234 [ HDFS-234 ]

          People

          • Assignee: Luca Telloli
          • Reporter: Luca Telloli
          • Votes: 0
          • Watchers: 3
