Hadoop Common / HADOOP-6909

Umbrella ticket for reducing the number of libraries in Common

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      There seems to be some consensus that moving libraries out of Common, with the eventual goal of getting rid of the Common package altogether, is a good idea: http://hadoop.markmail.org/thread/jhiactdekhywznac. Hopefully we can collect the incremental tasks needed to achieve that goal here.

          Activity

          Transition: Open → Resolved
          Time In Source Status: 1450d 32m
          Execution Times: 1
          Last Executer: Allen Wittenauer
          Last Execution Date: 30/Jul/14 18:12
          Allen Wittenauer made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Incomplete [ 4 ]
          Allen Wittenauer added a comment -

          Resolved? Won't Fix?

          You decide.

          But with the 3 or 4 project splits, mavenization, etc, etc, that's happened since this was filed, it isn't particularly relevant anymore.

          Ranjit Mathew added a comment -

          I think it makes sense to move fs APIs like FileSystem and FileContext into HDFS. The same way as mapreduce APIs like Mapper and Reducer are in mapreduce project.

          I believe HDFS is one file system supported by Hadoop, and Hadoop could potentially end up supporting others (e.g., Ceph via HADOOP-6253).
          So it makes sense to have a common FS API that then has a concrete implementation in HDFS.
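The layering discussed in this thread (a shared file-system contract with separate concrete implementations) can be illustrated with a small sketch. This is not Hadoop code: it is written in Python for brevity, and `FileSystemContract`, `InMemoryFs`, `BlockStoreFs`, `block_count`, and `copy` are hypothetical names invented for the illustration.

```python
from abc import ABC, abstractmethod


class FileSystemContract(ABC):
    """Hypothetical shared contract; stands in for a common FS API."""

    @abstractmethod
    def write(self, path: str, data: bytes) -> None:
        """Store data under path, replacing any existing content."""

    @abstractmethod
    def read(self, path: str) -> bytes:
        """Return the full content stored under path."""


class InMemoryFs(FileSystemContract):
    """Simplest possible implementation: one bytes object per path."""

    def __init__(self):
        self._files = {}

    def write(self, path, data):
        self._files[path] = bytes(data)

    def read(self, path):
        return self._files[path]


class BlockStoreFs(FileSystemContract):
    """Toy stand-in for a block-based store; splits files into blocks."""

    def __init__(self, block_size=4):
        self._block_size = block_size
        self._blocks = {}

    def write(self, path, data):
        n = self._block_size
        self._blocks[path] = [data[i:i + n] for i in range(0, len(data), n)]

    def read(self, path):
        return b"".join(self._blocks[path])

    def block_count(self, path):
        # Implementation-specific extra that is not part of the contract.
        return len(self._blocks[path])


def copy(src: FileSystemContract, src_path: str,
         dst: FileSystemContract, dst_path: str) -> None:
    """Client code that depends only on the shared contract."""
    dst.write(dst_path, src.read(src_path))
```

In this sketch `copy` works across both implementations because it touches only the contract, while `block_count` shows the kind of implementation-specific operation that a generic API would not carry. Where the contract should live (a shared module or the implementing project) is exactly the open question in the comments.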

          Konstantin Shvachko added a comment -
          • Getting rid of common, as Doug proposed, is a good idea, imo.
          • I think it makes sense to move fs APIs like FileSystem and FileContext into HDFS. The same way as mapreduce APIs like Mapper and Reducer are in mapreduce project.
          • I looked at the commons.vfs packages. There are clearly some similarities with fs.FileSystem, but the differences are substantial. I believe one can implement commons.vfs.HDFS using hadoop.fs.FileSystem. But using commons.vfs as a new API for HDFS would be a big, incompatible, hard-to-justify change, and commons.vfs also lacks APIs intrinsic to HDFS, such as getBlockLocations().
            So at this point I don't understand what the proposal of moving fs packages into Jakarta Commons means.
          Owen O'Malley added a comment -

          Moving FileSystem and FileContext to Jakarta is a non-starter. Having it in Common where we are all committers is hard enough.

          We've discussed moving FileSystem to HDFS, but that is both problematic (it moves a lot of non-HDFS stuff into HDFS) and breaks the abstraction that HDFS implements the contract required by FileSystem.

          I don't think this is a good direction.

          Arun C Murthy added a comment -

          Jay, can we please use HADOOP-6910 for discussions on configuration? Thanks.

          Jay Booth added a comment -

          Well, if we're encouraged to derail this thread with config-related stuff, what do people think about a variable-substituted properties file instead of XML?

          #comment
          key=value
          #other comment
          key2=${key}
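A minimal sketch of how the `${key}` substitution proposed above might be resolved, written in Python purely for illustration. The function names `parse_properties` and `resolve`, the `max_depth` cycle guard, and the choice to leave unknown references untouched are all assumptions of this sketch, not part of the proposal.

```python
import re

# Matches ${name} references inside a value.
_REF = re.compile(r"\$\{([^}]+)\}")


def parse_properties(text):
    """Parse 'key=value' lines, skipping blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in line: {raw!r}")
        props[key.strip()] = value.strip()
    return props


def resolve(props, max_depth=20):
    """Expand ${name} references; unknown names are left as written."""

    def expand(value, depth):
        if depth > max_depth:
            raise ValueError("substitution too deep (possible cycle)")

        def repl(match):
            name = match.group(1)
            if name not in props:
                return match.group(0)
            return expand(props[name], depth + 1)

        return _REF.sub(repl, value)

    return {key: expand(value, 0) for key, value in props.items()}


SAMPLE = """\
#comment
key=value
#other comment
key2=${key}
"""

if __name__ == "__main__":
    print(resolve(parse_properties(SAMPLE)))
```

With the sample file from the comment, `key2` resolves to the same string as `key`. A real format would also need to settle escaping rules and whether an undefined reference is an error.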
          Jeff Hammerbacher added a comment -

          Also, I apologize if I jumped the gun a little early, I became irrationally exuberant when I thought about moving to a generic configuration system.

          Arun C Murthy added a comment -

          Yep.

          Jeff Hammerbacher added a comment -

          I'm not sure I'd call it 'consensus' yet, we need a discussion before we reach consensus.

          Sure, and hopefully that discussion can take place here, to avoid derailing the thread on combining the committer lists.

          Arun C Murthy added a comment -

          There seems to be some consensus that moving libraries out of Common

          I'm not sure I'd call it 'consensus' yet, we need a discussion before we reach consensus.

          Arun C Murthy added a comment -

          Moving fs to HDFS might be reasonable, but moving metrics to Commons means we will have another dependency, plus it isn't clear metrics is usable in a wider context.

          Jeff Hammerbacher added a comment -

          Doug also mentions on that thread that the metrics and fs packages might also be moved to Jakarta Commons.

          Jeff Hammerbacher made changes -
          Link This issue is related to HADOOP-6910 [ HADOOP-6910 ]
          Jeff Hammerbacher made changes -
          Field Original Value New Value
          Link This issue is related to HADOOP-6659 [ HADOOP-6659 ]
          Jeff Hammerbacher created issue -

            People

            • Assignee: Unassigned
            • Reporter: Jeff Hammerbacher
            • Votes: 0
            • Watchers: 12
