  1. Hadoop Common
  2. HADOOP-17098

Reduce Guava dependency in Hadoop source code


Details

    • Task
    • Status: In Progress
    • Major
    • Resolution: Unresolved

    Description

      Relying on Guava in Hadoop has been painful because of compatibility and vulnerability issues.
      Guava updates tend to break or deprecate APIs, which has made it hard to maintain backward compatibility across Hadoop versions and with clients and downstream projects.

      Since Hadoop 3.x requires Java 8+, Java 8 features should be preferred to Guava. This reduces the dependency footprint and gives stability to the source code.

      This JIRA should serve as an umbrella for an incremental effort to reduce the usage of Guava in the source code, with subtasks created to replace Guava classes with Java features.

      Furthermore, it would be good to add a rule to the pre-commit build that warns against introducing new Guava usages in certain modules.

      Anyone willing to take part in this code refactoring should:

      1. Focus on one module at a time to reduce conflicts and the size of the patch. This will significantly help the reviewers.
      2. Run all the unit tests related to the module affected by the change. It is critical to verify that the change does not break the unit tests or cause a stable test case to become flaky.
      3. Merge the change to the following branches: trunk, branch-3.3, branch-3.2, branch-3.1.

       

      A list of sub-tasks replacing Guava APIs with Java 8 features:

      Guava API                                        Java 8 replacement
      com.google.common.io.BaseEncoding#base64()       java.util.Base64
      com.google.common.io.BaseEncoding#base64Url()    java.util.Base64
      com.google.common.base.Joiner.on()               java.lang.String#join() or
                                                       java.util.stream.Collectors#joining()
      com.google.common.base.Optional#of()             java.util.Optional#of()
      com.google.common.base.Optional#absent()         java.util.Optional#empty()
      com.google.common.base.Optional#fromNullable()   java.util.Optional#ofNullable()
      com.google.common.base.Optional                  java.util.Optional
      com.google.common.base.Predicate                 java.util.function.Predicate
      com.google.common.base.Function                  java.util.function.Function
      com.google.common.base.Supplier                  java.util.function.Supplier
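
      To make the mappings above concrete, here is a minimal sketch of what a few of the replacements could look like using only JDK 8 classes. The class name GuavaToJava8Examples and its helper methods are hypothetical and are not part of Hadoop. Note that the replacements are not always drop-in: for example, Guava's Joiner throws a NullPointerException on null elements unless skipNulls() or useForNull() is configured, whereas String#join() renders them as "null".

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;
import java.util.List;
import java.util.Objects;
import java.util.Optional;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical illustration class; not part of Hadoop.
class GuavaToJava8Examples {

  // Guava: BaseEncoding.base64().encode(data)
  static String encodeBase64(byte[] data) {
    return Base64.getEncoder().encodeToString(data);
  }

  // Guava: BaseEncoding.base64Url().encode(data)
  static String encodeBase64Url(byte[] data) {
    return Base64.getUrlEncoder().encodeToString(data);
  }

  // Guava: Joiner.on(",").join(parts)
  static String joinWithComma(List<String> parts) {
    return String.join(",", parts);
  }

  // Guava: Joiner.on(",").skipNulls().join(parts)
  static String joinSkippingNulls(List<String> parts) {
    return parts.stream()
        .filter(Objects::nonNull)
        .collect(Collectors.joining(","));
  }

  // Guava: Optional.fromNullable(value).or(fallback)
  static String orDefault(String value, String fallback) {
    return Optional.ofNullable(value).orElse(fallback);
  }

  // Guava's Predicate#apply(T) corresponds to java.util.function.Predicate#test(T),
  // and a Java 8 Predicate can be passed directly to Stream#filter.
  static List<String> keepNonEmpty(List<String> items) {
    Predicate<String> nonEmpty = s -> s != null && !s.isEmpty();
    return items.stream().filter(nonEmpty).collect(Collectors.toList());
  }

  public static void main(String[] args) {
    byte[] data = "hadoop".getBytes(StandardCharsets.UTF_8);
    System.out.println(encodeBase64(data));
    System.out.println(encodeBase64Url(data));
    System.out.println(joinWithComma(Arrays.asList("a", "b", "c")));
    System.out.println(joinSkippingNulls(Arrays.asList("a", null, "c")));
    System.out.println(orDefault(null, "fallback"));
    System.out.println(keepNonEmpty(Arrays.asList("x", "", "y")));
  }
}
```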
      

       

      I also vote for replacing Guava Preconditions with either a wrapper or Apache Commons Lang.

      I believe you have dealt with Guava compatibility issues in the past and probably have better insights. Any thoughts? weichiu, gabor.bota, stevel@apache.org, ayushtkn, busbey, jeagles, kihwal
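
      As a rough illustration of the wrapper option, a minimal sketch could look like the following. The class name SimplePreconditions is hypothetical and this is not the implementation that ended up in Hadoop; it only assumes that the wrapper should throw the same exception types as Guava's checkNotNull, checkArgument, and checkState.

```java
import java.util.Objects;

// Hypothetical wrapper for illustration; not Hadoop's actual implementation.
final class SimplePreconditions {

  private SimplePreconditions() {
    // Utility class, no instances.
  }

  // Throws NullPointerException with the given message if ref is null,
  // otherwise returns ref so the call can be used inline.
  static <T> T checkNotNull(T ref, String message) {
    return Objects.requireNonNull(ref, message);
  }

  // Throws IllegalArgumentException if the caller passed a bad argument.
  static void checkArgument(boolean expression, String message) {
    if (!expression) {
      throw new IllegalArgumentException(message);
    }
  }

  // Throws IllegalStateException if the object is in an unexpected state.
  static void checkState(boolean expression, String message) {
    if (!expression) {
      throw new IllegalStateException(message);
    }
  }
}
```

      A call site would then only need its import changed, for example SimplePreconditions.checkArgument(port > 0, "port must be positive"), which keeps the refactoring patch per module small.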

       


        Issue Links

          1. Replace Guava Predicate with Java8+ Predicate (Sub-task, Resolved, Ahmed Hussein)
          2. Replace Guava Supplier with Java8+ Supplier in Hadoop (Sub-task, Resolved, Ahmed Hussein)
          3. Replace Guava Supplier with Java8+ Supplier in MAPREDUCE (Sub-task, Resolved, Ahmed Hussein)
          4. Replace Guava Supplier with Java8+ Supplier in hdfs (Sub-task, Resolved, Ahmed Hussein)
          5. Replace Guava Function with Java8+ Function (Sub-task, Resolved, Ahmed Hussein)
          6. Replace Guava Optional with Java8+ Optional (Sub-task, Resolved, Ahmed Hussein)
          7. Update the checkstyle config to ban some guava functions (Sub-task, Resolved, Akira Ajisaka; time spent: 40m)
          8. add guava BaseEncoding to illegalClasses (Sub-task, Resolved, Ahmed Hussein; time spent: 1h 10m)
          9. remove guava Preconditions from Hadoop-common-project modules (Sub-task, Resolved, Ahmed Hussein; time spent: 1h 50m)
          10. implement non-guava Precondition checkNotNull (Sub-task, Resolved, Ahmed Hussein; time spent: 3h 20m)
          11. Replace Guava Preconditions to avoid Guava dependency (Sub-task, In Progress, Ahmed Hussein)
          12. Add unguava implementation for Joiner in hadoop.StringUtils (Sub-task, Patch Available, Ahmed Hussein)
          13. Implement wrapper for guava newArrayList and newLinkedList (Sub-task, Resolved, Viraj Jasani; time spent: 1h 20m)
          14. Replace Guava initialization of Lists.newArrayList (Sub-task, Resolved, Unassigned)
          15. Create Classes to wrap Guava code replacement (Sub-task, Open, Unassigned)
          16. Replace Guava Sets usage by Hadoop's own Sets in hadoop-common and hadoop-tools (Sub-task, Resolved, Viraj Jasani; time spent: 8h 10m)
          17. Replace Guava.Splitter with common.util.StringUtils (Sub-task, Open, Unassigned)
          18. Add checkstyle rule to prevent further usage of Guava classes (Sub-task, Resolved, Ahmed Hussein)
          19. Replace Guava Sets usage by Hadoop's own Sets in hadoop-hdfs-project (Sub-task, Resolved, Viraj Jasani; time spent: 3h)
          20. Replace Guava Sets usage by Hadoop's own Sets in hadoop-yarn-project (Sub-task, Resolved, Viraj Jasani; time spent: 1h 50m)
          21. Replace Guava Sets usage by Hadoop's own Sets in hadoop-mapreduce-project (Sub-task, Resolved, Viraj Jasani; time spent: 1h 10m)
          22. implement non-guava Precondition checkArgument (Sub-task, Resolved, Ahmed Hussein; time spent: 1h 40m)
          23. implement non-guava Precondition checkState (Sub-task, Resolved, Ahmed Hussein; time spent: 1h 20m)
          24. Provide alternative to Guava VisibleForTesting (Sub-task, Resolved, Viraj Jasani; time spent: 6h 20m)
          25. Replace Guava VisibleForTesting by Hadoop's own annotation in hadoop-common-project modules (Sub-task, Resolved, Viraj Jasani; time spent: 5h 50m)
          26. Replace Guava VisibleForTesting by Hadoop's own annotation in hadoop-hdfs-project modules (Sub-task, Resolved, Viraj Jasani; time spent: 1h)
          27. Replace Guava VisibleForTesting by Hadoop's own annotation in hadoop-cloud-storage-project and hadoop-mapreduce-project modules (Sub-task, Resolved, Viraj Jasani; time spent: 40m)
          28. hadoop-auth module cannot import non-guava implementation in hadoop util (Sub-task, Resolved, Unassigned)
          29. Replace Guava VisibleForTesting by Hadoop's own annotation in hadoop-tools modules (Sub-task, Resolved, Viraj Jasani; time spent: 50m)
          30. Replace Guava VisibleForTesting by Hadoop's own annotation in hadoop-yarn-project modules (Sub-task, Resolved, Viraj Jasani; time spent: 1h 10m)
          31. unguava: remove Preconditions from hdfs-projects module (Sub-task, Resolved, Ahmed Hussein; time spent: 1h 20m)
          32. unguava: remove Preconditions from hadoop-yarn-project modules (Sub-task, Resolved, Viraj Jasani; time spent: 1h 10m)
          33. unguava: remove Preconditions from hadoop-tools modules (Sub-task, Resolved, Viraj Jasani; time spent: 50m)
          34. Add restrict-imports-enforcer-rule for Guava Preconditions in hadoop-main pom (Sub-task, Resolved, Viraj Jasani; time spent: 4.5h)


            People

              ahussein Ahmed Hussein
              ahussein Ahmed Hussein
              Votes: 1
              Watchers: 16


                Time Tracking

                  Original estimate: Not Specified
                  Remaining estimate: 0h
                  Time spent: 49h 10m