Hadoop Common
  1. Hadoop Common
  2. HADOOP-6537

Proposal for exceptions thrown by FileContext and Abstract File System

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels:
      None
    • Release Note:
      Detailed exceptions declared in FileContext and AbstractFileSystem

      Description

      Currently the APIs in FileContext throw only IOException. Going forward these APIs will throw more specific exceptions.
      This jira proposes following hierarchy of exceptions to be thrown by FileContext and AFS (Abstract File System) classes.

      InterruptedException (java.lang.InterruptedException)

      IOException
      /* Following exceptions extend IOException */
      FileNotFoundException
      FileAlreadyExistsException
      DirectoryNotEmptyException
      NotDirectoryException
      AccessDeniedException
      IsDirectoryException
      InvalidPathNameException

      FileSystemException
      /* Following exceptions extend FileSystemException */
      FileSystemNotReadyException
      ReadOnlyFileSystemException
      QuotaExceededException
      OutOfSpaceException

      RemoteException (java.rmi.RemoteException)

      Most of the IOExceptions above are caused by invalid user input, while FileSystemException is thrown when FS is in such a state that the requested operation cannot proceed.
      Please note that the proposed RemoteException is from standard java rmi package, which also extends IOException.

      HDFS throws many exceptions which are not in the above list. The DFSClient will unwrap the exceptions thrown by HDFS, and any exception not in the above list will be thrown as IOException or FileSystemException.

      1. hdfs-717.4.patch
        104 kB
        Suresh Srinivas
      2. hdfs-717.3.patch
        99 kB
        Suresh Srinivas
      3. hdfs-717.2.patch
        7 kB
        Suresh Srinivas
      4. hdfs-717.1.patch
        64 kB
        Suresh Srinivas
      5. hdfs-717.patch
        77 kB
        Suresh Srinivas
      6. hdfs-717.patch
        74 kB
        Suresh Srinivas

        Issue Links

          Activity

          Hide
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk #266 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/266/)
          Declare more detailed exceptions in FileContext and AbstractFileSystem
          (Suresh Srinivas via Sanjay Radia)

          Show
          Hudson added a comment - Integrated in Hadoop-Common-trunk #266 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/266/ ) Declare more detailed exceptions in FileContext and AbstractFileSystem (Suresh Srinivas via Sanjay Radia)
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #193 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/193/)
          Declare more detailed exceptions in FileContext and AbstractFileSystem
          (Suresh Srinivas via Sanjay Radia)

          Show
          Hudson added a comment - Integrated in Hadoop-Common-trunk-Commit #193 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/193/ ) Declare more detailed exceptions in FileContext and AbstractFileSystem (Suresh Srinivas via Sanjay Radia)
          Hide
          Sanjay Radia added a comment -

          +1 Committed. Thanks Suresh

          Show
          Sanjay Radia added a comment - +1 Committed. Thanks Suresh
          Hide
          Suresh Srinivas added a comment -

          Tests are not included because comprehensive tests need HDFS changes. Tracking it in a separate jira Hadoop-6595.

          Show
          Suresh Srinivas added a comment - Tests are not included because comprehensive tests need HDFS changes. Tracking it in a separate jira Hadoop-6595.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12436751/hdfs-717.4.patch
          against trunk revision 915168.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/370/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/370/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/370/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/370/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12436751/hdfs-717.4.patch against trunk revision 915168. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/370/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/370/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/370/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/370/console This message is automatically generated.
          Hide
          Suresh Srinivas added a comment -

          exists(f) - should it return false instead of UnsupportedFileSystemException?
          > Also returns false on UnresolvedLinkException

          listStatus(all variations), globStatus(??) - add the UnresolvedLinkE
          > Added this to HADOOP-6556 and will be addressed in further FileContext API cleanup.

          Show
          Suresh Srinivas added a comment - exists(f) - should it return false instead of UnsupportedFileSystemException? > Also returns false on UnresolvedLinkException listStatus(all variations), globStatus(??) - add the UnresolvedLinkE > Added this to HADOOP-6556 and will be addressed in further FileContext API cleanup.
          Hide
          Sanjay Radia added a comment -
          • AbstractFileSystem
            • createFileSystem - you removed the javadoc for the args.
            • fix the spelling error: throw new HadoopIllegalArgumentException("Unkown CreateOpts of type " +
            • createInternal - missing FileNotFoundException, ParentNotDirectory
            • create, mkdir - missing ParentNotDirectory
            • getFileLinkStatus - missing FileNotFoundException
            • setVerifyChecksum - should not throw FileNotFoundException
          • FileContext
            • getFileLinkStatus and getLinkTarget - missing FileNotFoundException, AccessDeniesEx, FileSystemNotSupportedEx
            • mkdir - javadoc, ist line:
              • "Make the given file and all non-existent parents into directories."
                Change to
              • "Make(create) a directory"
            • createSymlink - missing detailed exceptions - same as create.
            • exists(f) - should it return false instead of UnsupportedFileSystemException?
            • getContentSummary, listStatus(all variations), globStatus(??) - add the UnresolvedLinkE
            • copy - ParentNotDir, FileNotFound, UnresolvedLinkE
          Show
          Sanjay Radia added a comment - AbstractFileSystem createFileSystem - you removed the javadoc for the args. fix the spelling error: throw new HadoopIllegalArgumentException("Unkown CreateOpts of type " + createInternal - missing FileNotFoundException, ParentNotDirectory create, mkdir - missing ParentNotDirectory getFileLinkStatus - missing FileNotFoundException setVerifyChecksum - should not throw FileNotFoundException FileContext getFileLinkStatus and getLinkTarget - missing FileNotFoundException, AccessDeniesEx, FileSystemNotSupportedEx mkdir - javadoc, ist line: "Make the given file and all non-existent parents into directories." Change to "Make(create) a directory" createSymlink - missing detailed exceptions - same as create. exists(f) - should it return false instead of UnsupportedFileSystemException? getContentSummary, listStatus(all variations), globStatus(??) - add the UnresolvedLinkE copy - ParentNotDir, FileNotFound, UnresolvedLinkE
          Hide
          Suresh Srinivas added a comment -

          New patch after merging latest trunk changes.

          Show
          Suresh Srinivas added a comment - New patch after merging latest trunk changes.
          Hide
          Suresh Srinivas added a comment -

          New patch has the following changes:

          1. New exception UnsupportedFileSystemException is added. This is a separate exception because FileContext resolves URI to a FileSystem on the fly. Any FileContext operation could fail to resolve URI to a file sytem. This merits a separate exception.
          2. All the comments from Sanjay are addressed.

          > FileContext#setVerifyChecksum - shouldn't it throw notSupportedException ? This is a subtle issue. For the convenience of testing one does not want the exception.
          This is documentation change only (UnsupportedOperationException is unchecked). Will do it in a separate jira, along with other changes.

          I have also covered some of the issues you have raised in HADOOP-6556.

          Show
          Suresh Srinivas added a comment - New patch has the following changes: New exception UnsupportedFileSystemException is added. This is a separate exception because FileContext resolves URI to a FileSystem on the fly. Any FileContext operation could fail to resolve URI to a file sytem. This merits a separate exception. All the comments from Sanjay are addressed. > FileContext#setVerifyChecksum - shouldn't it throw notSupportedException ? This is a subtle issue. For the convenience of testing one does not want the exception. This is documentation change only (UnsupportedOperationException is unchecked). Will do it in a separate jira, along with other changes. I have also covered some of the issues you have raised in HADOOP-6556 .
          Hide
          Suresh Srinivas added a comment -

          @Eli - I noticed FileContext#delete on error sometimes returns false and sometimes throws an IOException. Should we remove the boolean return value and always throw an IOException?

          This is the same case with FileContext#copy. I have created HADOOP-6556 to address it.

          Show
          Suresh Srinivas added a comment - @Eli - I noticed FileContext#delete on error sometimes returns false and sometimes throws an IOException. Should we remove the boolean return value and always throw an IOException? This is the same case with FileContext#copy. I have created HADOOP-6556 to address it.
          Hide
          Philip Zeyliger added a comment -

          @Suresh, if the current proposal isn't using FileSystemException, then my point is moot.

          All I was trying to say is that, when possible, we should avoid to have multiple classes named the same thing but in different packages. Specifically, having both java.io.FileSystemException and org.apache.hadoop.io.FileSystemException is a bit confusing because a casual user won't know which to import, and why they're different.

          – Philip

          Show
          Philip Zeyliger added a comment - @Suresh, if the current proposal isn't using FileSystemException, then my point is moot. All I was trying to say is that, when possible, we should avoid to have multiple classes named the same thing but in different packages. Specifically, having both java.io.FileSystemException and org.apache.hadoop.io.FileSystemException is a bit confusing because a casual user won't know which to import, and why they're different. – Philip
          Hide
          Suresh Srinivas added a comment -

          @Philip - Your comment is not very clear. Current proposal does not use FileSystemException. As regards to local and Hadoop, the APIs are the same. Not sure about the issue you have with it.

          Show
          Suresh Srinivas added a comment - @Philip - Your comment is not very clear. Current proposal does not use FileSystemException. As regards to local and Hadoop, the APIs are the same. Not sure about the issue you have with it.
          Hide
          Sanjay Radia added a comment -
          • I would declare the RPC exception right now and describe them in the javadoc of FileContext and AbstractFileSystem (i.e. if the file system is remote then it may throw an RPC exception). Later we can modify the rpc layer to throw that exception.
          Show
          Sanjay Radia added a comment - I would declare the RPC exception right now and describe them in the javadoc of FileContext and AbstractFileSystem (i.e. if the file system is remote then it may throw an RPC exception). Later we can modify the rpc layer to throw that exception.
          Hide
          Eli Collins added a comment -

          I noticed FileContext#delete on error sometimes returns false and sometimes throws an IOException. Should we remove the boolean return value and always throw an IOException?

          Show
          Eli Collins added a comment - I noticed FileContext#delete on error sometimes returns false and sometimes throws an IOException. Should we remove the boolean return value and always throw an IOException?
          Hide
          Sanjay Radia added a comment -
          • HadoopIllegalArgumentException - parameters should be final.
          • InvalidPathException - package should be fs not io (same as Path),
            Make InvalidPath a subclass of HadoopIllegalArgument; ie an unchecked exception
            • create and mkdir should document them in javadoc
            • Q. do the other methods throw InvalidPathException or FileNotFound in case of an illegal path?
          • CheckPath - javadoc add @throws HadoopIllegalArgumentException
          • Spelling typos: "acess"
          • FileContext - the internal private methods should also throw more specific exceptions
          • FileContext#create:
            FileNotFoundException - typo "parent of dir does not exist" and .. -> parent of f does not exist and ...
            Missing exception: ParentNotDirectoryException
            Remove InvalidPath as it is unchecked exception - see my comment above.
          • FileContext#mkdir: Missing exception: ParentNotDirectoryException
          • FileContext#rename
            Javadoc: @throws FileAlreadyExists if dst already exist ... ADD — and OVERWRITE is false.
          • FileContext#setVerifyChecksum
            shouldn't it throw notSupportedException ? This is a subtle issue. For the convenience of testing one does not want the exception.
          • While reading the patch I came up with some further cleanup of the FileContext APis - some of them are related to exceptions but others
            are not. We should file a separate Jira on this if we decide to change the spec.
            • isFile, isDir - should it throw notFoundException or return false if the path does not exist?
            • util#getFileStatus - we return a partial list if some of the paths are invalid or not accessible. Should we throw an exception if any of the paths are invalid or not accessible?
            • deleteOnExit should return void instead of boolean.
          Show
          Sanjay Radia added a comment - HadoopIllegalArgumentException - parameters should be final. InvalidPathException - package should be fs not io (same as Path), Make InvalidPath a subclass of HadoopIllegalArgument; ie an unchecked exception create and mkdir should document them in javadoc Q. do the other methods throw InvalidPathException or FileNotFound in case of an illegal path? CheckPath - javadoc add @throws HadoopIllegalArgumentException Spelling typos: "acess" FileContext - the internal private methods should also throw more specific exceptions FileContext#create: FileNotFoundException - typo "parent of dir does not exist" and .. -> parent of f does not exist and ... Missing exception: ParentNotDirectoryException Remove InvalidPath as it is unchecked exception - see my comment above. FileContext#mkdir: Missing exception: ParentNotDirectoryException FileContext#rename Javadoc: @throws FileAlreadyExists if dst already exist ... ADD — and OVERWRITE is false. FileContext#setVerifyChecksum shouldn't it throw notSupportedException ? This is a subtle issue. For the convenience of testing one does not want the exception. While reading the patch I came up with some further cleanup of the FileContext APis - some of them are related to exceptions but others are not. We should file a separate Jira on this if we decide to change the spec. isFile, isDir - should it throw notFoundException or return false if the path does not exist? util#getFileStatus - we return a partial list if some of the paths are invalid or not accessible. Should we throw an exception if any of the paths are invalid or not accessible? deleteOnExit should return void instead of boolean.
          Hide
          Philip Zeyliger added a comment -

          It's hard to tell where the current proposal is, but if FileSystemException is going to be in Java 1.7, it might be nice to not name Hadoop's exception by the same name (unless they're one and the same). Code that uses DFSClient might also be using the local file system using the regular Java APIs, and it might be nice to separate the two systems (local and Hadoop).

          Show
          Philip Zeyliger added a comment - It's hard to tell where the current proposal is, but if FileSystemException is going to be in Java 1.7, it might be nice to not name Hadoop's exception by the same name (unless they're one and the same). Code that uses DFSClient might also be using the local file system using the regular Java APIs, and it might be nice to separate the two systems (local and Hadoop).
          Hide
          Suresh Srinivas added a comment -

          Updating patch with missing files

          Show
          Suresh Srinivas added a comment - Updating patch with missing files
          Hide
          Suresh Srinivas added a comment -

          Changes in the attached patch:

          1. AbstractFileSystem and FileContext methods throw exceptions as proposed.
          2. Method javadoc updated to include description of the exceptions thrown
          Show
          Suresh Srinivas added a comment - Changes in the attached patch: AbstractFileSystem and FileContext methods throw exceptions as proposed. Method javadoc updated to include description of the exceptions thrown
          Hide
          Sanjay Radia added a comment -

          The proposal is to declare the precise set of exception ("Going forward these APIs will throw more specific exceptions").
          This is generally good practice and you are right that Avro needs this info. So we should declare the precise exceptions not just
          in FileContext but also in the protocol definition.

          Show
          Sanjay Radia added a comment - The proposal is to declare the precise set of exception ("Going forward these APIs will throw more specific exceptions"). This is generally good practice and you are right that Avro needs this info. So we should declare the precise exceptions not just in FileContext but also in the protocol definition.
          Hide
          Doug Cutting added a comment -

          To facilitate an eventual switch to Avro, protocols should specify the most precise set of exceptions in their throws clauses. Simply specifying IOException is not sufficient, since Avro will then serialize and deserialize IOException rather than a more specific subclass. Most RPC systems (Avro included) do not implement inheritance, so parameters, return types and exceptions in protocols should ideally be concrete, not base classes or interfaces.

          Show
          Doug Cutting added a comment - To facilitate an eventual switch to Avro, protocols should specify the most precise set of exceptions in their throws clauses. Simply specifying IOException is not sufficient, since Avro will then serialize and deserialize IOException rather than a more specific subclass. Most RPC systems (Avro included) do not implement inheritance, so parameters, return types and exceptions in protocols should ideally be concrete, not base classes or interfaces.
          Hide
          Sanjay Radia added a comment -

          The protocol methods (such as ClientProtocol) should declare/throw the RpcException.

          Show
          Sanjay Radia added a comment - The protocol methods (such as ClientProtocol) should declare/throw the RpcException.
          Hide
          Sanjay Radia added a comment -

          +1 on the proposal.

          Show
          Sanjay Radia added a comment - +1 on the proposal.
          Hide
          Suresh Srinivas added a comment -

          The following are the layers between the application and the Service Implementation (such as NameNode).
          Application <= Client library => <=RPC client=> <= Network => <= RPC server => <= Service Impl

          Key goals:

          1. InterruptedExceptions in the client library should not be ignored. This will help in clean application shutdown. InterruptedException on the server side should not be ignored; see below.
          2. Applications must be able to differentiate between RPC layer exception from the exceptions in the Service Impl. Applications can choose to retry a request based on different categories of exceptions received.
          3. Exceptions declared in the API should be propagated end to end over RPC from the server to the application. All undeclared exceptions from the Service Impl including InterruptedException should be handled by the RPC layer.
          4. Changes needed in applications to move to FileContext from FileSystem should be minimal.

          Proposal:
          Exceptions will be organized as shown below.

          1. IOException
            • exceptions as declared in the RPC API - note the detailed method exception will be declared even though they are a subclass of IOException
            • RPCException - exceptions in the rpc layer
              • RPCClientException - exception encountered in RPC client
              • RPCServerException - exception encountered in RPC server
              • UnexpectedServerException - unexpected exception from the Service Impl to RPC handlers.
          2. RunTimeException
            • HadoopIllegalArgumentException - sublcass of IllegalArgumentException indicates illegal or inappropriate argument.
            • HadoopInterruptedException - subclass of RunTimeException thrown on encountering InterruptedException.
            • UnsupportedOperationException - thrown to indicate the requested operation is not supported.

          Rationale:

          1. declared exception should be subclass of IOException as before - no changes here.
          2. group the rpc exceptions categorized by client side and server side.
          3. use runtime exception for InterrruptedException - simplifies migration to FileContext. Subclass of IOException not used as applications might have catch and ignore code.
          4. HadoopIllegalArgumentException instead of the java IllegalArgumentException - helps differentiate exception in Hadoop implementation from exception thrown from java libraries. Applications can choose to catch IllegalArgumentException.
          5. unsupported operation is indicated by unchecked UnsupportedOperationException - subclass of IOException not used as applications might have catch and ignore code. Using RunTimeException since applications cannot recover from this condition.

          Implementation details:
          InterruptedException handling:

          1. Client side changes
            • Client library (both API interface and RPC client) and InputStream and OutputStream returned by FileContext throw unchecked HadoopInterruptedException on InterruptedException.
          2. Server changes:
            • InterruptedException is currently ignored in the Service Impl layer. With this change the Service Impl will throw the exception. Methods in protocol classes such as ClientProtocol will specify InterruptedException in throws clause.
            • On InterruptedException, RPC handlers close the socket connection to the client. Client handles this failure same as loss of connection.

          RPC layer changes

          1. RPC layer marshalls HadoopInterruptedException, HadoopIllegalArgumentException, UnsupportedException from Service Impl all the way to the client.
          2. RPC layer throws RPCClientException, RPCServerException and UnexpectedServerException.

          FileContext and AbstractFileSystem and protocol changes:

          1. Methods in FileContext declare IOException and the relevant subclasses of IOExceptions. This helps document the specific exceptions thrown and in marshalling the exception from the server to application over RPC. RPCExceptions are not declared as thrown in FileContext and AbstractFileSystem, as some implementation might not use RPC layer (Local file system).
            example:
            public FSDataInputStream open(Path path) throws IOException, FileNotFoundException, AccessDeniedException;
            
          2. Protocol methods (such as ClientProtocol) will the throw exceptions similar to FileContext, along with InterruptedException.

          Finally the FileContext will throw the following exceptions. The exception hierarchy is flattened. The semantics remains as defined in the earlier comments.

          1. IOException
            • ServerNotReadyException (NameNode safemode etc)
            • OutOfSpaceException for write operations
            • AccessControlException
            • InvalidPathNameException
            • FileNotFoundException
            • FileAlreadyExistsException
            • DirectoryNotEmptyException
            • NotDirectoryException
            • DirectoryNotAllowedException
          Show
          Suresh Srinivas added a comment - The following are the layers between the application and the Service Implementation (such as NameNode). Application <= Client library => <=RPC client=> <= Network => <= RPC server => <= Service Impl Key goals: InterruptedExceptions in the client library should not be ignored. This will help in clean application shutdown. InterruptedException on the server side should not be ignored; see below. Applications must be able to differentiate between RPC layer exception from the exceptions in the Service Impl. Applications can choose to retry a request based on different categories of exceptions received. Exceptions declared in the API should be propagated end to end over RPC from the server to the application. All undeclared exceptions from the Service Impl including InterruptedException should be handled by the RPC layer. Changes needed in applications to move to FileContext from FileSystem should be minimal. Proposal: Exceptions will be organized as shown below. IOException exceptions as declared in the RPC API - note the detailed method exception will be declared even though they are a subclass of IOException RPCException - exceptions in the rpc layer RPCClientException - exception encountered in RPC client RPCServerException - exception encountered in RPC server UnexpectedServerException - unexpected exception from the Service Impl to RPC handlers. RunTimeException HadoopIllegalArgumentException - sublcass of IllegalArgumentException indicates illegal or inappropriate argument. HadoopInterruptedException - subclass of RunTimeException thrown on encountering InterruptedException. UnsupportedOperationException - thrown to indicate the requested operation is not supported. Rationale: declared exception should be subclass of IOException as before - no changes here. group the rpc exceptions categorized by client side and server side. use runtime exception for InterrruptedException - simplifies migration to FileContext. Subclass of IOException not used as applications might have catch and ignore code. HadoopIllegalArgumentException instead of the java IllegalArgumentException - helps differentiate exception in Hadoop implementation from exception thrown from java libraries. Applications can choose to catch IllegalArgumentException. unsupported operation is indicated by unchecked UnsupportedOperationException - subclass of IOException not used as applications might have catch and ignore code. Using RunTimeException since applications cannot recover from this condition. Implementation details: InterruptedException handling: Client side changes Client library (both API interface and RPC client) and InputStream and OutputStream returned by FileContext throw unchecked HadoopInterruptedException on InterruptedException. Server changes: InterruptedException is currently ignored in the Service Impl layer. With this change the Service Impl will throw the exception. Methods in protocol classes such as ClientProtocol will specify InterruptedException in throws clause. On InterruptedException, RPC handlers close the socket connection to the client. Client handles this failure same as loss of connection. RPC layer changes RPC layer marshalls HadoopInterruptedException, HadoopIllegalArgumentException, UnsupportedException from Service Impl all the way to the client. RPC layer throws RPCClientException, RPCServerException and UnexpectedServerException. FileContext and AbstractFileSystem and protocol changes: Methods in FileContext declare IOException and the relevant subclasses of IOExceptions. This helps document the specific exceptions thrown and in marshalling the exception from the server to application over RPC. RPCExceptions are not declared as thrown in FileContext and AbstractFileSystem, as some implementation might not use RPC layer (Local file system). example: public FSDataInputStream open(Path path) throws IOException, FileNotFoundException, AccessDeniedException; Protocol methods (such as ClientProtocol) will the throw exceptions similar to FileContext, along with InterruptedException. Finally the FileContext will throw the following exceptions. The exception hierarchy is flattened. The semantics remains as defined in the earlier comments. IOException ServerNotReadyException (NameNode safemode etc) OutOfSpaceException for write operations AccessControlException InvalidPathNameException FileNotFoundException FileAlreadyExistsException DirectoryNotEmptyException NotDirectoryException DirectoryNotAllowedException
          Hide
          Tsz Wo Nicholas Sze added a comment -

          > On the FileSystemOperationException ...

          I am suggesting

          • open(...) throws SystemException, RemoteException, InterruptedIOException, AccessControlException, InvalidPathNameException, FileNotFoundException, IsDirectoryException

          SystemException, RemoteException and InterruptedIOException should be thrown by all methods. Then, the specific FileSystemOperationException subclasses are listed out explicitly. Other subclasses of IOException should never be thrown.

          > ... This will prevents us from using any existing IOExceptions like FileNotFoundException.

          I actually suggest create a new o.a.h.fs.FileNotFoundException, which extends FileSystemOperationException and prevent using java.io.FileNotFoundException because java.io.FileNotFoundException is overloaded with different meaning. The following is quoted from the java.io.FileNotFoundException API.

          It will also be thrown by these constructors if the file does exist but for some reason is inaccessible, for example when an attempt is made to open a read-only file for writing.

          Show
          Tsz Wo Nicholas Sze added a comment - > On the FileSystemOperationException ... I am suggesting open(...) throws SystemException, RemoteException, InterruptedIOException, AccessControlException, InvalidPathNameException, FileNotFoundException, IsDirectoryException SystemException, RemoteException and InterruptedIOException should be thrown by all methods. Then, the specific FileSystemOperationException subclasses are listed out explicitly. Other subclasses of IOException should never be thrown. > ... This will prevents us from using any existing IOExceptions like FileNotFoundException. I actually suggest create a new o.a.h.fs.FileNotFoundException, which extends FileSystemOperationException and prevent using java.io.FileNotFoundException because java.io.FileNotFoundException is overloaded with different meaning. The following is quoted from the java.io.FileNotFoundException API . It will also be thrown by these constructors if the file does exist but for some reason is inaccessible, for example when an attempt is made to open a read-only file for writing.
          Hide
          Jitendra Nath Pandey added a comment -

          > BTW this jira should have been a hadoop-common and not hadoop-hdfs jira.
          Agreed. The corresponding jira in hadoop-common is HADOOP-6361

          Show
          Jitendra Nath Pandey added a comment - > BTW this jira should have been a hadoop-common and not hadoop-hdfs jira. Agreed. The corresponding jira in hadoop-common is HADOOP-6361
          Hide
          Sanjay Radia added a comment -

          UnsupportedOperationException:

          This one is tricky. See my comment in the rename jira: https://issues.apache.org/jira/browse/HADOOP-6240?focusedCommentId=12756632&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12756632

          Basically the argument is that folks routinely write applications that are to be deployed on HDFS, but these apps are often tested on the local file system.
          Unfortunately, the unsupported exception will be thrown when testing the app.

          Further wrt to setReplication, I think our API needs improvement: the setReplication operation needs to be more abstract like "want redundancy" or "not want redundancy". No sure what the API should look like. A filesystem that uses raid can give you reliability by parity rather than replication. I don't think it is a good idea to have setReplication throw UnsupportedOpsException.

          Perhaps we should have methods like "IsFooOperationSupported()".

          I am raising issues rather then offering answers.

          Show
          Sanjay Radia added a comment - UnsupportedOperationException: This one is tricky. See my comment in the rename jira: https://issues.apache.org/jira/browse/HADOOP-6240?focusedCommentId=12756632&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12756632 Basically the argument is that folks routinely write applications that are to be deployed on HDFS, but these apps are often tested on the local file system. Unfortunately, the unsupported exception will be thrown when testing the app. Further wrt to setReplication, I think our API needs improvement: the setReplication operation needs to be more abstract like "want redundancy" or "not want redundancy". No sure what the API should look like. A filesystem that uses raid can give you reliability by parity rather than replication. I don't think it is a good idea to have setReplication throw UnsupportedOpsException. Perhaps we should have methods like "IsFooOperationSupported()". I am raising issues rather then offering answers.
          Hide
          Sanjay Radia added a comment -

          Nicolas> I propose to have the following hierarchy: ...
          +1 to use the name SystemException instead of FileSystemException as this can be reused in other client-server protocols.

          On the FileSystemOperationException - I guess you are trying to distinguish from the SystemException grouping and other
          non-file-system exceptions. So with your proposal will the method signature be:

          • open(...) throws IOException, FileSystemOperationException, SystemException, RemoteException, InterruptedIOException
            OR
          • open(...) throws FileSystemOperationException, SystemException, RemoteException, InterruptedIOException

          (i.e Are there any IOException thrown besides FileSystemOperationException, SystemException, RemoteException, InterruptedIOException?).
          This will prevents us from using any existing IOExceptions like FileNotFoundException.

          Show
          Sanjay Radia added a comment - Nicolas> I propose to have the following hierarchy: ... +1 to use the name SystemException instead of FileSystemException as this can be reused in other client-server protocols. On the FileSystemOperationException - I guess you are trying to distinguish from the SystemException grouping and other non-file-system exceptions. So with your proposal will the method signature be: open(...) throws IOException, FileSystemOperationException, SystemException, RemoteException, InterruptedIOException OR open(...) throws FileSystemOperationException, SystemException, RemoteException, InterruptedIOException (i.e Are there any IOException thrown besides FileSystemOperationException, SystemException, RemoteException, InterruptedIOException?). This will prevents us from using any existing IOExceptions like FileNotFoundException.
          Hide
          Sanjay Radia added a comment -

          >> I disagree here. We should unwrap only the ones that are declared in the method signature.
          >> RPC/network exceptions should be thrown as RemoteException

          >I also considered RemoteException as a container for passing exceptions through RPC. If the name-node throws an IOException the >DFSClient should unwrap it and throw the original directly rather than wrapped into the RemoteException.

          At this stage I don't have a strong opinion on what exception is used as the container across the wire. I also need to look into how avro is passing exceptions across the wire; will do so shortly. The RPC layer can be smarter about passing only declared exceptions and converting others into the remote exception in the RPC layer - I recall doing something like that in previous lifetime.

          If the RemoteException is the wrapper/container then yes client side should unwrap it, but only unwrap what is declared since these are the only classes available on the client side. The client side does not have all the Server-side exception classes.
          So if the server side throws a internal FooException that extends IOException the client side can only throw the IOException not the
          FooException.

          Show
          Sanjay Radia added a comment - >> I disagree here. We should unwrap only the ones that are declared in the method signature. >> RPC/network exceptions should be thrown as RemoteException >I also considered RemoteException as a container for passing exceptions through RPC. If the name-node throws an IOException the >DFSClient should unwrap it and throw the original directly rather than wrapped into the RemoteException. At this stage I don't have a strong opinion on what exception is used as the container across the wire. I also need to look into how avro is passing exceptions across the wire; will do so shortly. The RPC layer can be smarter about passing only declared exceptions and converting others into the remote exception in the RPC layer - I recall doing something like that in previous lifetime. If the RemoteException is the wrapper/container then yes client side should unwrap it, but only unwrap what is declared since these are the only classes available on the client side. The client side does not have all the Server-side exception classes. So if the server side throws a internal FooException that extends IOException the client side can only throw the IOException not the FooException.
          Hide
          Tsz Wo Nicholas Sze added a comment -
          • The term FileSystemException is not clear whether it is a system problem or a file system operation problem. A better name may be SystemException. It also allow other systems like JobTracker to use it.
          • It may be better to group FileNotFoundException, FileAlreadyExistsException, etc. How about extending FileSystemOperationException?
          • I propose to have the following hierarchy:
            • java.io.IOException
              • java.rmi.RemoteException — network/ipc related exception
              • java.io.InterruptedIOException — the client thread is interrupted when it is waiting for a blocking IO (e.g. waiting for server reply).
              • SystemException — capture system problems, usually, server side problems.
                • UnsupportedOperationException — e.g. calling setReplication(..) on LocalFileSystem
                  • ReadOnlyFileSystemException — e.g. calling create(..) on HftpFileSystem
                • SystemNotReadyException — e.g. NameNode/JobTracker is starting up
                • QuotaExceededException
                • OutOfSpaceException
                • UncaughtRuntimeException — server side uncaught RuntimeException
              • FileSystemOperationException
                • AccessControlException
                • InvalidPathNameException
                • FileNotFoundException
                • FileAlreadyExistsException
                • DirectoryNotEmptyException
                • NotDirectoryException
                • IsDirectoryException
          Show
          Tsz Wo Nicholas Sze added a comment - The term FileSystemException is not clear whether it is a system problem or a file system operation problem. A better name may be SystemException. It also allow other systems like JobTracker to use it. It may be better to group FileNotFoundException, FileAlreadyExistsException, etc. How about extending FileSystemOperationException? I propose to have the following hierarchy: java.io.IOException java.rmi.RemoteException — network/ipc related exception java.io.InterruptedIOException — the client thread is interrupted when it is waiting for a blocking IO (e.g. waiting for server reply). SystemException — capture system problems, usually, server side problems. UnsupportedOperationException — e.g. calling setReplication(..) on LocalFileSystem ReadOnlyFileSystemException — e.g. calling create(..) on HftpFileSystem SystemNotReadyException — e.g. NameNode/JobTracker is starting up QuotaExceededException OutOfSpaceException UncaughtRuntimeException — server side uncaught RuntimeException FileSystemOperationException AccessControlException InvalidPathNameException FileNotFoundException FileAlreadyExistsException DirectoryNotEmptyException NotDirectoryException IsDirectoryException
          Hide
          Konstantin Shvachko added a comment -

          >> Should QuotaExceededException be a subclass of IOException, same as AccessDeniedException?
          > All the exceptions except Interrupted exceptions are subclasses of IOException.

          I meant QuotaExceededException should be a direct subclass of IOException as opposed to be a subclass of FileSystemException.

          > I disagree here. We should unwrap only the ones that are declared in the method signature.
          > RPC/network exceptions should be thrown as RemoteException

          I also considered RemoteException as a container for passing exceptions through RPC. If the name-node throws an IOException the DFSClient should unwrap it and throw the original directly rather than wrapped into the RemoteException.

          Show
          Konstantin Shvachko added a comment - >> Should QuotaExceededException be a subclass of IOException, same as AccessDeniedException? > All the exceptions except Interrupted exceptions are subclasses of IOException. I meant QuotaExceededException should be a direct subclass of IOException as opposed to be a subclass of FileSystemException. > I disagree here. We should unwrap only the ones that are declared in the method signature. > RPC/network exceptions should be thrown as RemoteException I also considered RemoteException as a container for passing exceptions through RPC. If the name-node throws an IOException the DFSClient should unwrap it and throw the original directly rather than wrapped into the RemoteException.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          >> Throw java.io.InterruptedIOException instead of InterruptedException.
          > We thought of InterruptedException to capture interrupts to any thread and not just threads busy with IO.

          Do you mean that a server thread is interrupted by some reason and then the server side InterruptedException is forwarded to the client? This sounds like a system problem.

          I was considering the case that the a client is waiting for a blocking IO (e.g. waiting for server reply) and the client thread is interrupted. In such case, InterruptedIOException should be used.

          Show
          Tsz Wo Nicholas Sze added a comment - >> Throw java.io.InterruptedIOException instead of InterruptedException. > We thought of InterruptedException to capture interrupts to any thread and not just threads busy with IO. Do you mean that a server thread is interrupted by some reason and then the server side InterruptedException is forwarded to the client? This sounds like a system problem. I was considering the case that the a client is waiting for a blocking IO (e.g. waiting for server reply) and the client thread is interrupted. In such case, InterruptedIOException should be used.
          Hide
          Sanjay Radia added a comment -

          >> Do you plan to unwrap ALL RemoteExceptions in DFSClient?
          >Yes all these exceptions will be unwrapped. But that implementation is not part of this jira.

          I disagree here. We should unwrap only the ones that are declared in the method signature.
          RPC/network exceptions should be thrown as RemoteException
          All others can be thrown as a FileSystemException with the details rewrapped. Further a well implemented NN should catch all
          non-declared exceptions on the server side and throw them as FileSystem exceptions (after logging them).
          DFSClient cannot assume that the NN is correctly implemented and hence also has to handle undeclared exceptions and throw
          FileSystemException.

          Show
          Sanjay Radia added a comment - >> Do you plan to unwrap ALL RemoteExceptions in DFSClient? >Yes all these exceptions will be unwrapped. But that implementation is not part of this jira. I disagree here. We should unwrap only the ones that are declared in the method signature. RPC/network exceptions should be thrown as RemoteException All others can be thrown as a FileSystemException with the details rewrapped. Further a well implemented NN should catch all non-declared exceptions on the server side and throw them as FileSystem exceptions (after logging them). DFSClient cannot assume that the NN is correctly implemented and hence also has to handle undeclared exceptions and throw FileSystemException.
          Hide
          Sanjay Radia added a comment -

          BTW this jira should have been a hadoop-common and not hadoop-hdfs jira.

          Show
          Sanjay Radia added a comment - BTW this jira should have been a hadoop-common and not hadoop-hdfs jira.
          Hide
          Sanjay Radia added a comment -

          >Can you put forth the use case why we have two different sub-types of exceptions: IOException and FileSystemException?

          I can see a case for exceptions that are "system problems", that is, not caused by an invalid parameter supplied by the
          caller. For example, the file system is full, or it is in safe mode. Generally the caller cannot do much about this and hence it is
          worth grouping such exceptions. Today your application may run fine and tomorrow the very same application may fail because of a "system problem". This is not bug in the application. So FileSystemExceptions are not application errors but system errors.(It is worth considering making these runtime exception and not checked exceptions; but this is orthogonal to the grouping question.)

          What else should fall into this group?
          Lack of quotas falls into the same category since the admin sets the quotas and the user does has not much control over this.

          Should access-control exceptions should be in the same group? I think not because this exception is mostly, but not always, caused by
          an application error.

          Show
          Sanjay Radia added a comment - >Can you put forth the use case why we have two different sub-types of exceptions: IOException and FileSystemException? I can see a case for exceptions that are "system problems", that is, not caused by an invalid parameter supplied by the caller. For example, the file system is full, or it is in safe mode. Generally the caller cannot do much about this and hence it is worth grouping such exceptions. Today your application may run fine and tomorrow the very same application may fail because of a "system problem". This is not bug in the application. So FileSystemExceptions are not application errors but system errors.(It is worth considering making these runtime exception and not checked exceptions; but this is orthogonal to the grouping question.) What else should fall into this group? Lack of quotas falls into the same category since the admin sets the quotas and the user does has not much control over this. Should access-control exceptions should be in the same group? I think not because this exception is mostly, but not always, caused by an application error.
          Hide
          Jitendra Nath Pandey added a comment -

          > Is FileSystemException a subclass of IOException?
          Yes FileSystemException is a subclass of IOException.

          > Add UnsupportedOperationException as a subclass of FileSystemException.
          I agree to add this one.

          > Throw java.io.InterruptedIOException instead of InterruptedException.
          We thought of InterruptedException to capture interrupts to any thread and not just threads busy with IO.

          > why we have two different sub-types of exceptions?

          FileSystemException is thrown when FS is in such a state that the requested operation cannot proceed for example if filesystem is in safemode, we can throw FileSystemNotReadyException which is a subtype of FileSystemException. The other IOExceptions are thrown when there are inconsistencies in the user inputs and operation cannot succees for example creating a file that already exists.

          > If it is java.nio.file.FileSystemException then it will appear only in java 7.
          FileSystemException is inspired by java.nio.file.FileSystemException but we will implement it because it is not in java 6.

          > What is IsDirectoryException? Sounds more like a method name.
          IsDirectoryException is inspired by Unix error code EISDIR, which means that expected input was a file but a directory is being passed. We can rename it to DirectoryNotAllowedException.

          > Should QuotaExceededException be a subclass of IOException, same as AccessDeniedException?
          All the exceptions except Interrupted exceptions are subclasses of IOException.

          > Do you plan to unwrap ALL RemoteExceptions in DFSClient?
          Yes all these exceptions will be unwrapped. But that implementation is not part of this jira.

          Show
          Jitendra Nath Pandey added a comment - > Is FileSystemException a subclass of IOException? Yes FileSystemException is a subclass of IOException. > Add UnsupportedOperationException as a subclass of FileSystemException. I agree to add this one. > Throw java.io.InterruptedIOException instead of InterruptedException. We thought of InterruptedException to capture interrupts to any thread and not just threads busy with IO. > why we have two different sub-types of exceptions? FileSystemException is thrown when FS is in such a state that the requested operation cannot proceed for example if filesystem is in safemode, we can throw FileSystemNotReadyException which is a subtype of FileSystemException. The other IOExceptions are thrown when there are inconsistencies in the user inputs and operation cannot succees for example creating a file that already exists. > If it is java.nio.file.FileSystemException then it will appear only in java 7. FileSystemException is inspired by java.nio.file.FileSystemException but we will implement it because it is not in java 6. > What is IsDirectoryException? Sounds more like a method name. IsDirectoryException is inspired by Unix error code EISDIR, which means that expected input was a file but a directory is being passed. We can rename it to DirectoryNotAllowedException. > Should QuotaExceededException be a subclass of IOException, same as AccessDeniedException? All the exceptions except Interrupted exceptions are subclasses of IOException. > Do you plan to unwrap ALL RemoteExceptions in DFSClient? Yes all these exceptions will be unwrapped. But that implementation is not part of this jira.
          Hide
          Konstantin Shvachko added a comment -
          1. What is FileSystemException? If it is java.nio.file.FileSystemException then it will appear only in java 7.
          2. What is IsDirectoryException? Sounds more like a method name.
          3. Should QuotaExceededException be a subclass of IOException, same as AccessDeniedException?
          4. Do you plan to unwrap ALL RemoteExceptions in DFSClient? Currently we unwrap only a few exception, while others are still thrown as RemoteException. I think we should not see RemoteException anywhere in FileContext or AbstractFS.
          Show
          Konstantin Shvachko added a comment - What is FileSystemException? If it is java.nio.file.FileSystemException then it will appear only in java 7. What is IsDirectoryException? Sounds more like a method name. Should QuotaExceededException be a subclass of IOException, same as AccessDeniedException? Do you plan to unwrap ALL RemoteExceptions in DFSClient? Currently we unwrap only a few exception, while others are still thrown as RemoteException. I think we should not see RemoteException anywhere in FileContext or AbstractFS.
          Hide
          dhruba borthakur added a comment -

          Can you put forth the use case why we have two different sub-types of exceptions: IOException and FileSystemException?

          For example, can you pl explain why QuotaExceededException is a FileSystemException?

          Show
          dhruba borthakur added a comment - Can you put forth the use case why we have two different sub-types of exceptions: IOException and FileSystemException? For example, can you pl explain why QuotaExceededException is a FileSystemException?
          Hide
          Tsz Wo Nicholas Sze added a comment -

          Sound good.

          Some comments:

          • Throw java.io.InterruptedIOException instead of InterruptedException.
          • Add UnsupportedOperationException as a subclass of FileSystemException.
          • Is FileSystemException a subclass of IOException?
          Show
          Tsz Wo Nicholas Sze added a comment - Sound good. Some comments: Throw java.io.InterruptedIOException instead of InterruptedException. Add UnsupportedOperationException as a subclass of FileSystemException. Is FileSystemException a subclass of IOException?

            People

            • Assignee:
              Suresh Srinivas
              Reporter:
              Jitendra Nath Pandey
            • Votes:
              0 Vote for this issue
              Watchers:
              12 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development