Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels: None

      Description

      Process the value of the property dfs.name.edits.dirs as a URI, to allow schemes other than just file. As an advantage, Java supports the constructor File(URI), so the transition is straightforward for files.
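      The File(URI) transition mentioned above can be sketched as follows. This is a minimal illustration, not the patch itself; UriDirExample and toUri are made-up names, not Hadoop APIs:

```java
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;

public class UriDirExample {
    // Interpret a configured directory value as a URI; values without a
    // scheme are treated as local paths and given the file:// scheme.
    // (Illustrative helper, not part of the actual patch.)
    static URI toUri(String name) throws URISyntaxException {
        URI u = new URI(name);
        if (u.getScheme() == null) {
            u = new URI("file://" + new File(name).getAbsolutePath());
        }
        return u;
    }

    public static void main(String[] args) throws URISyntaxException {
        URI dir = toUri("/tmp/name");    // schemeless value, becomes file://
        File asFile = new File(dir);     // File(URI) handles the file case
        System.out.println(dir + " -> " + asFile);
    }
}
```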

      1. HADOOP-5832.patch
        29 kB
        Luca Telloli
      2. HADOOP-5832.patch
        31 kB
        Luca Telloli
      3. HADOOP-5832.patch
        33 kB
        Luca Telloli
      4. HADOOP-5832.patch
        31 kB
        Luca Telloli
      5. HDFS-396.patch
        31 kB
        Raghu Angadi

        Issue Links

          Activity

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #25 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk/25/)
          HDFS-463. CreateEditLog utility broken after (URI for
          FSImage). (Suresh Srinivas via rangadi)

          Tsz Wo Nicholas Sze added a comment -

          Hi Flavio, I just tried the HDFS-456 patch. It works on Windows. Thanks for the info.

          Flavio Junqueira added a comment -

          Nicholas, Have you tried the patch in HDFS-456?

          Tsz Wo Nicholas Sze added a comment -

          Talked to Jakob. It looks like the committed patch causes the problem described here.

          Raghu Angadi added a comment -

          I just committed this. Thanks Luca.

          Raghu Angadi added a comment -

          Luca, with the project split, I am not sure how the Hudson patch queue is being processed. When Hudson is backlogged, Hadoop requires developers to run 'ant test-patch' and the unit tests themselves. Could you run the unit tests with the patch (the new patch or yours)? If you are not familiar with 'test-patch', I can run it.

          Luca Telloli added a comment -

          looks good thanks!

          Raghu Angadi added a comment -

          +1. The patch looks good.

          I am attaching the patch for new hdfs trunk.

          Luca Telloli added a comment -

          Raghu, thanks for the comments; they're helpful to me for making good progress!

          • I factored out the repeated code
          • I added a consistency check also for the properties fs.checkpoint.dir and fs.checkpoint.edits.dir, which wasn't previously in place
          Raghu Angadi added a comment -

          Thanks for the changes. Looks better.

          nit: Could you move the block of code that checks for consistency (including the check to see if it is file://) into a method? It is repeated for dirs and edits (it is around 20 lines).

          Luca Telloli added a comment - edited

          > Could you confirm whether any of the error handling is different compared to current trunk? If there are any differences, could you list them here?

          My approach has been to catch any new exception and throw a new IOException with its content.

          > Please update hdfs-default.xml for dfs.name.dir. Mainly the description needs to be updated for the new syntax.

          Done

          > Maybe a warning when there is no scheme would help get configs updated.

          Done

          > Indentation "fixes": I don't see any reason to do indentation fixes far away from the actual changes for this patch. Some of these are not even fixes. A good rule of thumb I follow is that if I have a hunk in the patch that has only these changes, I remove it from my patch. That is easy to do. On the plus side, you won't get 'svn blame'd for bugs around there.

          It's a consequence of using Eclipse in a different configuration. Just setting the tab stop to 2 and using whitespace instead of tabs proved insufficient to avoid these situations. Although I tried to fix most of them manually, there might be a couple of leftovers.

          Finally, I think the approach should be:

          • non-URI: consider it valid, use it as file:// and warn the user, as you suggested
          • unknown URI scheme: the scheme should be among the ones specified in the JournalType enumeration; otherwise, hard error

          I'm posting a new patch that fixes the issues you stated above; please let me know your impressions of it.
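          The two rules above could be sketched as follows. This is a hypothetical illustration under the stated policy, not the actual patch: the real JournalType enumeration, helper names, and messages may differ.

```java
import java.io.File;
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;

public class SchemeCheckExample {
    // Hypothetical stand-in for the JournalType enumeration mentioned above.
    enum JournalType { FILE, BACKUP }

    // Policy sketch: schemeless values are treated as file:// with a warning;
    // malformed URIs and unknown schemes are a hard error.
    static URI checkedUri(String name) throws IOException {
        try {
            URI u = new URI(name);
            if (u.getScheme() == null) {
                System.err.println("WARN: no scheme in \"" + name + "\", assuming file://");
                return new URI("file://" + new File(name).getAbsolutePath());
            }
            JournalType.valueOf(u.getScheme().toUpperCase()); // throws if unsupported
            return u;
        } catch (URISyntaxException | IllegalArgumentException e) {
            throw new IOException("Unsupported edits directory: " + name, e);
        }
    }
}
```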

          Raghu Angadi added a comment -

          Luca,

          Could you confirm whether any of the error handling is different compared to current trunk? If there are any differences, could you list them here? It is harder for reviewers alone to do this. Could you confirm that things that caused a hard error still do, and vice versa?
          For example:

                try {
                  URI u = new URI(name);
                  // If the scheme was not declared, default to file://
                  // and use the absolute path of the file 
                  if(u.getScheme() == null)
                    u = new URI("file://" + new File(name).getAbsolutePath());
                  dirs.add(u);
                } catch (Exception e) {
                  LOG.error("Error while processing URI: " + name + 
                      ". The error message was: " + e.getMessage());
                }
          

          Edits and FSImage directories are a very critical part of NN operation. Ignoring errors like this might not be good.

          Other comments:

          1. Please update hdfs-default.xml for dfs.name.dir. Mainly the description needs to be updated for the new syntax.
          2. Maybe a warning when there is no scheme would help get configs updated.
          3. Indentation "fixes": I don't see any reason to do indentation fixes far away from the actual changes for this patch. Some of these are not even fixes. A good rule of thumb I follow is that if I have a hunk in the patch that has only these changes, I remove it from my patch. That is easy to do. On the plus side, you won't get 'svn blame'd for bugs around there.
          Flavio Junqueira added a comment -

          Raghu, being able to specify non-file URIs is part of the point of this patch. As Luca is working on it towards the integration of BookKeeper (HADOOP-5189), we would like to eventually be able to specify a URI for non-file types of logging such as BookKeeper.

          Raghu Angadi added a comment -

          What happens if someone specifies a non-file URI with this patch? I would say any unsupported URI should result in a hard error.

          Flavio Junqueira added a comment -

          There is an issue open about TestHdfsProxy failing (HADOOP-6007), so it sounds like it is a problem with the test.

          If no one else has any other comment on the patch, it would be great to have it committed.

          Luca Telloli added a comment -

          I checked the two tests that failed in Hudson:

          • mapred/TestReduceFetch failed on my node with trunk both patched and unpatched, so the problem seems related to some previous issue. Btw, on Hudson this test was timing out, while on my box it fails due to an error.
          • contrib/TestHdfsProxy passes without issues on my node; as far as I can see, this test has been failing on other patches previous to this one, so again I'm not sure whether its outcome is related to this patch at all.
          Raghu Angadi added a comment -

          > It is true that the patch is spread out, but most of it replaces the use of File with URI, and touching multiple files seems necessary to add the proposed functionality. I don't see a good way around it, so it is fine with me.

          Right. I meant since it is a spread out patch, indentation is more important.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12410084/HADOOP-5832.patch
          against trunk revision 782708.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 23 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/481/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/481/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/481/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/481/console

          This message is automatically generated.

          Luca Telloli added a comment -

          resubmitting, since it looks like there are issues with Hudson

          Flavio Junqueira added a comment -

          +1, it looks good to me. I've only been able to find 3 instances in which this patch replaces an existing tab with the correct indentation. HDFS tests pass fine for me. Thanks, Luca!

          Luca Telloli added a comment -

          Addressed the issues from the previous comments; in particular:

          • fixed the failing test
          • fixed the indentation issue; since most of the work has been done manually, there are a couple of places where the code doesn't change, only the indentation. I apologize in advance for those lines, but at the same time I don't think I should revert them, since the bad indentation was there before the patch and is resolved afterwards. Additionally, as I said, the number of lines where this happens is very limited.
          Luca Telloli added a comment -

          cancelling for later resubmission due to latest comments

          Flavio Junqueira added a comment -

          On the comments from Raghu:

          • -1 on indentation. I'm seeing tabs in some included lines, and some statements seem misplaced, like the "finally" declaration in TestBackupNode.java;
          • It is true that the patch is spread out, but most of it replaces the use of File with URI, and touching multiple files seems necessary to add the proposed functionality. I don't see a good way around it, so it is fine with me.

          I have run the hdfs tests, and all but one pass: Test org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer

          It gives me the following error:

          Testcase: testOIV took 7.667 sec
          	Caused an ERROR
          null
          java.lang.ArrayStoreException
          	at java.lang.System.arraycopy(Native Method)
          	at java.util.Arrays.copyOf(Arrays.java:2763)
          	at java.util.ArrayList.toArray(ArrayList.java:305)
          	at org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer.initFsimage(TestOfflineImageViewer.java:127)
          	at org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer.testOIV(TestOfflineImageViewer.java:77)
          

          Inspecting the code, we see that line 127 of the test is:

          TestOfflineImageViewer.java
          File [] files = cluster.getNameDirs().toArray(new File[0]);
          

          which looks like it could be caused by the changes of this patch. In fact, without the patch, this test runs fine for me.
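          The failure is consistent with getNameDirs() now holding URIs behind a raw collection type, so the old toArray(new File[0]) call only fails at runtime. A standalone reproduction of that mechanism (not the actual test code):

```java
import java.io.File;
import java.net.URI;
import java.util.ArrayList;
import java.util.Collection;

public class ArrayStoreDemo {
    public static void main(String[] args) {
        // A raw Collection hides the element type from the compiler; if the
        // elements are now URIs, copying them into a File[] fails at runtime
        // with exactly the ArrayStoreException seen in the stack trace above.
        Collection dirs = new ArrayList();
        dirs.add(URI.create("file:///tmp/name"));
        try {
            File[] files = (File[]) dirs.toArray(new File[0]);
            System.out.println("no exception: " + files.length);
        } catch (ArrayStoreException expected) {
            System.out.println("ArrayStoreException, as in TestOfflineImageViewer");
        }
    }
}
```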

          Raghu Angadi added a comment -

          This is a pretty spread-out patch. Could you fix the indentation in multiple places? There are patch guidelines on the wiki, but the two most important ones are:

          1. use spaces instead of tabs, and
          2. use indentation of 2.

          It would matter less if the patch was a small isolated change.

          Benjamin Reed added a comment -

          +1 looks good

          Luca Telloli added a comment - edited

          Attaching a specific patch for this issue, since HADOOP-5188 has not been fully agreed upon so far. This should also help the review process.

          This patch allows passing the values of the properties dfs.name.dir and dfs.name.edits.dir as URIs, along with fs.checkpoint.dir and fs.checkpoint.edits.dir.
          The patch supports current (plain) directories as values, assuming that they refer to a file:// type of URI.

          This patch should also help HADOOP-5188, for the deployment of types of logging other than files.

          Konstantin Shvachko added a comment -

          This comment is relevant to this issue: http://issues.apache.org/jira/browse/HADOOP-5188#action_12711913
          dhruba borthakur added a comment -

          This looks like a good one to have!


            People

            • Assignee:
              Luca Telloli
            • Reporter:
              Luca Telloli
            • Votes:
              0
            • Watchers:
              5
