Crunch / CRUNCH-622

From.avroFile fails if path not on default filesystem

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.13.0, 0.14.0
    • Fix Version/s: 0.15.0
    • Component/s: Core
    • Labels:
      None

      Description

          MemPipeline.getInstance().read(From.avroFile(new Path("s3:///something")));
      

      Fails with:

      java.lang.IllegalArgumentException: Wrong FS: s3:/something, expected: file:///
      
      	at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:643)
      	at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:80)
      	at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:519)
      	at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:737)
      	at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:514)
      	at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:397)
      	at org.apache.hadoop.fs.FileSystem.isFile(FileSystem.java:1424)
      	at org.apache.crunch.io.From.getSchemaFromPath(From.java:351)
      	at org.apache.crunch.io.From.avroFile(From.java:306)
      	at org.apache.crunch.io.From.avroFile(From.java:280)
      

      I noticed this in the From class, method getSchemaFromPath:

            FileSystem fs = FileSystem.get(conf);
      

      Shouldn't that be changed to this?

            FileSystem fs = path.getFileSystem(conf);
      

      We ran into this in a use case where the file was at a valid path on S3 but the Configuration's default filesystem pointed to HDFS. I believe that should just work.
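
To make the difference between the two calls concrete, here is a minimal sketch that models only the scheme selection, using plain java.net.URI instead of Hadoop. The class and method names (FsSchemeSketch, schemeFromDefault, schemeFromPath, checkPath) are hypothetical and are not part of Crunch or Hadoop; the sketch assumes that FileSystem.get(conf) always returns the default filesystem, that path.getFileSystem(conf) prefers the path's own scheme, and that checkPath rejects a path whose scheme differs from the filesystem's.

```java
import java.net.URI;

public class FsSchemeSketch {

    // Models FileSystem.get(conf): the default filesystem is used,
    // regardless of what the path says.
    public static String schemeFromDefault(URI defaultFs) {
        return defaultFs.getScheme();
    }

    // Models path.getFileSystem(conf): the path's own scheme wins;
    // the default is only a fallback for paths without a scheme.
    public static String schemeFromPath(URI path, URI defaultFs) {
        return path.getScheme() != null ? path.getScheme() : defaultFs.getScheme();
    }

    // Models the scheme part of the check that throws "Wrong FS":
    // a path without a scheme is accepted, otherwise the schemes must match.
    public static void checkPath(String fsScheme, URI path) {
        String pathScheme = path.getScheme();
        if (pathScheme != null && !pathScheme.equalsIgnoreCase(fsScheme)) {
            throw new IllegalArgumentException(
                "Wrong FS: " + path + ", expected: " + fsScheme);
        }
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("file:///");      // default filesystem from the stack trace
        URI path = URI.create("s3:///something");    // path from the description

        // Current behaviour: default filesystem plus S3 path fails the check.
        try {
            checkPath(schemeFromDefault(defaultFs), path);
        } catch (IllegalArgumentException e) {
            System.out.println("current:  " + e.getMessage());
        }

        // Proposed behaviour: filesystem resolved from the path passes the check.
        checkPath(schemeFromPath(path, defaultFs), path);
        System.out.println("proposed: ok");
    }
}
```

With the scheme taken from the path, the default filesystem in the Configuration no longer matters for fully qualified paths, and unqualified paths still fall back to the default as before.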

      After some searching, I also found CRUNCH-47, which seems related, but the patch there couldn't fix the From/At/To helpers because they were introduced later.

        Attachments

        1. CRUNCH-622.patch
          1.0 kB
          Micah Whitacre

            People

            • Assignee:
              mkwhitacre Micah Whitacre
              Reporter:
              tomdeleu Tom De Leu
            • Votes:
              0
              Watchers:
              3