SQOOP-1033

CombineFileInputFormat does not work with paths not on default FS like ASV

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.4
    • Fix Version/s: 1.4.4
    • Component/s: None
    • Labels:
      None

      Description

      CombineFileInputFormat does not work with ASV. This surfaced in Sqoop, which failed to export files stored in ASV. CombineFileInputFormat strips the scheme and authority components from each path, after which the path is assumed to be on the default file system.
      Sqoop has its own copy of CombineFileInputFormat, but the ones in core were updated as well for consistency.

      Resolved JIRAs already exist for the same issue:
      https://issues.apache.org/jira/browse/MAPREDUCE-2704
      https://issues.apache.org/jira/browse/MAPREDUCE-1806
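
The failure mode described above can be illustrated with a minimal sketch that uses only java.net.URI, not the Hadoop classes or the actual patch. The default file system URI, the asv:// path, and all class and method names below are hypothetical and chosen only for illustration.

```java
import java.net.URI;

public class SchemeStrippingSketch {
    // Hypothetical default file system, standing in for the cluster's configured default.
    static final URI DEFAULT_FS = URI.create("hdfs://namenode:8020");

    // Mimics the problematic behavior: only the path component survives.
    static URI stripSchemeAndAuthority(URI input) {
        return URI.create(input.getPath());
    }

    // A URI without a scheme is resolved against the default file system;
    // a fully qualified URI is returned unchanged.
    static URI resolveAgainstDefault(URI input) {
        if (input.getScheme() != null) {
            return input;
        }
        return DEFAULT_FS.resolve(input.getPath());
    }

    public static void main(String[] args) {
        // Hypothetical input file on a non-default file system.
        URI asvFile = URI.create("asv://mycontainer/export/part-m-00000");

        // After stripping, the file is looked up on the wrong file system.
        URI stripped = stripSchemeAndAuthority(asvFile);
        System.out.println("stripped:  " + resolveAgainstDefault(stripped));

        // Keeping scheme and authority preserves the original location.
        System.out.println("preserved: " + resolveAgainstDefault(asvFile));
    }
}
```

In this sketch the stripped path resolves to the hdfs:// default, where the file does not exist, while the preserved URI still points at the asv:// location.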

      1. SQOOP-1033.1.patch
        0.7 kB
        Shuaishuai Nie
      2. SQOOP-1033.2.patch
        0.8 kB
        Shuaishuai Nie
      3. SQOOP-1033.3.patch
        0.8 kB
        Shuaishuai Nie

          Activity

          Shuaishuai Nie added a comment -

          This patch also keeps the CombineFileInputFormat class consistent with Hadoop.

          Venkat Ranganathan added a comment -

          Thanks for the fix - something that I was planning to do, but did not get around to. Can you add comments describing the issue and the need for preserving the scheme and authority in the URI to work with other filesystems? Adding a Review Board request would also be helpful.

          Shuaishuai Nie added a comment -

          Thanks Venkat. CombineFileInputFormat strips the scheme and authority components from the path, after which the path is assumed to be on the default file system, so the file cannot be located correctly on a non-default file system. Since Sqoop and Hadoop share the same code for CombineFileInputFormat and the Hadoop version has already received the related fix, the same fix should also be applied to the CombineFileInputFormat in Sqoop. The review is available here: https://reviews.apache.org/r/10988/

          Venkat Ranganathan added a comment -

          Thanks Shuaishuai Nie.
          I understand why you did it; I was just saying that it would be good to add the comment in the code so that people are not tempted to optimize it away by mistake in the future.

          Venkat Ranganathan added a comment -

          I added the Review Board request as a web link to this issue. In the future, please do so.

          Shuaishuai Nie added a comment -

          Thanks for the advice, Venkat; I will do so next time. The comment has been added in the patch.

          Shuaishuai Nie added a comment -

          Fixed the format of the patch.

          Jarek Jarcec Cecho added a comment -

          Assigning to Shuaishuai Nie.

          Jarek Jarcec Cecho added a comment -

          The patch is in: https://git-wip-us.apache.org/repos/asf?p=sqoop.git;a=commit;h=fd756a0c403e2d3c982096b165e53c2fefe8f31f

          Thank you Shuaishuai for your contribution!

          Jarcec

          Shuaishuai Nie added a comment -

          Thanks Jarek


            People

            • Assignee: Shuaishuai Nie
            • Reporter: Shuaishuai Nie
            • Votes: 0
            • Watchers: 3