Details
Type: Sub-task
Status: Resolved
Priority: Critical
Resolution: Fixed
2.0.0
None
Description
There are contexts where we want to stay out of our downstream users' way with respect to dependencies, but they need more Hadoop classes than we provide: for example, any downstream client that wants to use both HBase and HDFS in its application, or any non-MR YARN application.
Now that Hadoop also has shaded client artifacts for Hadoop 3, we also provide less incremental benefit by including our own rewritten Hadoop classes, which exist so that downstream users don't need to pull in all of Hadoop's transitive dependencies.
Right now those users need to ensure that any jars from the Hadoop project appear on the classpath before our shaded client jar. This is brittle and prone to hard-to-debug failures.
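To see which jar actually wins under a given classpath order, a small check like the following may help. It is only a sketch: `WhichJar` is a hypothetical helper and not part of HBase or Hadoop, and `org.apache.hadoop.conf.Configuration` is used as the default simply because it is a commonly loaded Hadoop class. It relies on the standard `Class.getProtectionDomain()` and `CodeSource.getLocation()` calls.

```java
import java.security.CodeSource;

/** Hypothetical debugging helper; not part of HBase or Hadoop. */
public class WhichJar {

    /** Returns where the named class was loaded from, without initializing it. */
    public static String locate(String className) {
        try {
            Class<?> cls = Class.forName(className, false, WhichJar.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            // Core JDK classes loaded by the bootstrap loader report no code source.
            if (source == null || source.getLocation() == null) {
                return "no code source (likely a JDK class)";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            // The class, or something it needs in order to load, is missing.
            return "not found on classpath";
        }
    }

    public static void main(String[] args) {
        // Default target is an assumption; pass any fully qualified class name instead.
        String target = args.length > 0 ? args[0] : "org.apache.hadoop.conf.Configuration";
        System.out.println(target + " -> " + locate(target));
    }
}
```

Running it with the same classpath as the application shows whether the class comes from a Hadoop project jar or from the shaded client jar, which is the ordering the paragraph above describes as brittle.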
Instead, we should have two artifacts: one that just lists Hadoop as a prerequisite and one that still includes the rewritten-but-not-relocated Hadoop classes.
We can then use docs to emphasize when each of these is appropriate to use.