Replace deprecated method FileSystem#createNonRecursive

    Details

    • Type: Task
    • Status: Resolved
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: 2.0.0
    • Component/s: None
    • Labels:
      None

      Description

      This change directly affects ProtobufLogWriter#init(), which is exercised by TestHLog#testFailedToCreateHLogIfParentRenamed.

          Activity

          enis Enis Soztutar added a comment -

          HBase is supported (to the extent that the corresponding vendors support it) on a couple of file systems other than HDFS: gpfs, maprfs, EMC Isilon, and Microsoft WASB are the ones off the top of my head. You should contact the corresponding vendor if you want to learn more.

          ryantotti RyanTotti added a comment -

          Does that mean I can only use HDFS for HBase? Are there any other Hadoop-supported choices?

          stevel@apache.org Steve Loughran added a comment -

          BTW, given that HDFS has un-deprecated createNonRecursive(), what about closing this as a WONTFIX?

          stevel@apache.org Steve Loughran added a comment -

          You cannot use Swift instead of HDFS. It isn't a real filesystem and things will fail dramatically; even if this method were implemented, there are too many other differences. The fact that your attempt is failing this early on, while frustrating, stops you getting deeper into trouble. Sorry. Note that you can't use S3 either, for the same reason.

          see: [Object stores vs filesystems](http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/introduction.html).

          ryantotti RyanTotti added a comment -

          I use Swift instead of HDFS as storage for HBase, and the HBase RegionServer cannot start. HBase uses the deprecated method createNonRecursive() in org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.init().

          enis Enis Soztutar added a comment -

          "Personally, I wouldn't have had create() create any parent directories at all, leave it to the responsibility of the caller, but for reasons of history, that's not the case..."

          Agreed. create() should not auto-create the parents. This is very unintuitive. However, at this point it seems we cannot change it. Having an explicit argument to create() would be good, though.

          enis Enis Soztutar added a comment -

          "FileSystem#createNonRecursive() isn't implemented by many filesystems, using it would run the risk of hitting implementations that don't."

          HBase does not have to work on every file system. We should be OK to list support for an atomic createNonRecursive() as a requirement on the underlying FS for HBase.

          "Is there any barrier to using a check for the parent dir existing before calling create?"

          I am afraid there is. HBase needs this to be atomic in the namespace, since we use it for fencing. Like atomic renames, it is a requirement on the FS, and I don't think it is unrealistic. The FS owns the namespace, and it should be able to provide these semantics.
          We rely on an atomic directory rename + createNonRecursive() for fencing: it ensures that a region server that has been declared dead cannot create more WAL files and commit new data. Otherwise, there will be data loss. In a non-atomic implementation, there is a race condition: we rename the directory, but the fenced-out server can still create a new file there and continue committing data, which will then be lost.
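
          The fencing argument can be illustrated with a small local-filesystem analogy. This is only a sketch under stated assumptions: it uses java.nio.file rather than the Hadoop FileSystem API, Files.createFile stands in for createNonRecursive() because it never creates parent directories, and the class name, directory names, and file names are all made up for the illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration; not HBase or Hadoop code.
final class FencingSketch {

    /**
     * Returns true if the fenced-out ("zombie") writer could still create a
     * new WAL file after its log directory was renamed away.
     */
    static boolean zombieCanCreateAfterRename() {
        try {
            Path root = Files.createTempDirectory("wal-fencing");
            Path walDir = Files.createDirectory(root.resolve("rs1"));
            Files.createFile(walDir.resolve("wal.0"));

            // Fencing step: the recovering side renames the dead server's log directory.
            Path splitting = root.resolve("rs1-splitting");
            Files.move(walDir, splitting, StandardCopyOption.ATOMIC_MOVE);

            // The zombie still holds the old path and tries to roll a new WAL file.
            try {
                // Files.createFile does not create missing parents (assumed analogue
                // of createNonRecursive()), so the missing directory rejects the write.
                Files.createFile(walDir.resolve("wal.1"));
                return true;
            } catch (NoSuchFileException parentGone) {
                return false;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("zombie can still write: " + zombieCanCreateAfterRename());
    }
}
```

          Because the create call and the parent check are a single filesystem operation here, there is no window in which the renamed-away directory can be silently recreated.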

          stevel@apache.org Steve Loughran added a comment -

          Personally, I wouldn't have had create() create any parent directories at all, leave it to the responsibility of the caller, but for reasons of history, that's not the case...

          gustavoanatoly Gustavo Anatoly added a comment -

          Hi, Steve. Sorry for the delay.

          About your question:

          "Is there any barrier to using a check for the parent dir existing before calling create?"

          I think there is no problem, but validating that the parent path still exists is Hadoop's responsibility. So your idea is good if encapsulated in HFileSystem#createNonRecursive():

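          // Proposed HFileSystem#createNonRecursive(): check the parent, then create.
          // The exists() check and create() are separate calls, so this is not atomic.
          public FSDataOutputStream createNonRecursive(Path f, boolean overwrite, int bufferSize,
              short replication, long blockSize, Progressable progress) throws IOException {
            if (!fs.exists(f.getParent())) {
              String exceptionMsg = "Path doesn't exist: " + f.getParent().toString();
              LOG.error(exceptionMsg);
              throw new IOException(exceptionMsg);
            }
            return fs.create(f, overwrite, bufferSize, replication, blockSize, progress);
          }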
          

          And with that adjustment, ProtobufLogWriter#init() (line 79) becomes:

          ...
          HFileSystem hFS = new HFileSystem(fs);
          output = hFS.createNonRecursive(path, overwritable, bufferSize, replication, blockSize, null);
          output.write(ProtobufLogReader.PB_WAL_MAGIC);
          ...
          

          But I would prefer to put this validation in Hadoop. What do you think, Steve Loughran, Ted Yu, [~enis-2]?

          stevel@apache.org Steve Loughran added a comment -

          FileSystem#createNonRecursive() isn't implemented by many filesystems; using it would run the risk of hitting implementations that don't.

          Is there any barrier to using a check for the parent dir existing before calling create? That's essentially what most filesystems would end up doing:

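          FSDataOutputStream createNonRecursive(FileSystem fs, Path p) throws IOException {
            // Fail if the parent is missing instead of letting create() make it.
            if (!fs.exists(p.getParent())) {
              throw new FileNotFoundException(p.getParent().toString());
            }
            return fs.create(p);
          }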
          

          It's not atomic, but if you look closely at the source, it's not atomic in most FS implementations anyway, including native (mkdirs() isn't atomic there).
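
          To make the non-atomic window concrete, here is a minimal sketch of the check-then-create pattern with a rename injected between the two steps. It is an assumption-laden analogy, not Hadoop code: java.nio.file replaces the FileSystem API, Files.createDirectories plus Files.createFile emulates a create() that makes missing parents, and the hook parameter, class name, and paths are hypothetical.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration; not HBase or Hadoop code.
final class CheckThenCreateRace {

    /**
     * Emulates "exists(parent), then create(file)" where create() makes any
     * missing parents. The hook runs in the gap between the two calls and
     * stands in for a concurrent rename by another process.
     */
    static void checkThenCreate(Path file, Runnable betweenCheckAndCreate) throws IOException {
        if (!Files.exists(file.getParent())) {
            throw new FileNotFoundException(file.getParent().toString());
        }
        betweenCheckAndCreate.run();
        Files.createDirectories(file.getParent()); // recursive create re-makes the parent
        Files.createFile(file);
    }

    /** Returns true if a file landed in the old directory after the fencing rename. */
    static boolean fencingWasBypassed() {
        try {
            Path root = Files.createTempDirectory("wal-race");
            Path walDir = Files.createDirectory(root.resolve("rs1"));
            Path splitting = root.resolve("rs1-splitting");

            checkThenCreate(walDir.resolve("wal.1"), () -> {
                try {
                    Files.move(walDir, splitting); // the rename lands mid-call
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            return Files.exists(walDir.resolve("wal.1"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("fencing bypassed: " + fencingWasBypassed());
    }
}
```

          The hook makes the interleaving deterministic for the demonstration; in a real deployment the same ordering would only occur occasionally, which is what makes the race hard to spot.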

          enis Enis Soztutar added a comment -

          Ted Yu, can you please link the HDFS issue here?

          gustavoanatoly Gustavo Anatoly added a comment -

          Thanks Enis.

          Do you have any suggestions? Or should this issue be closed for now?

          enis Enis Soztutar added a comment -

          Any more context on this?

          FileSystem#createNonRecursive() is deprecated, but Hadoop unfortunately fails to provide a viable alternative. The FileContext API, which is supposed to replace the FileSystem API, is not yet mature enough for us to switch.
          createNonRecursive() is extremely important for properly fencing dead region servers.


            People

            • Assignee:
              gustavoanatoly Gustavo Anatoly
              Reporter:
              gustavoanatoly Gustavo Anatoly