Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Done
Description
Since 3.2, Spark uses the shaded hadoop-client-api and hadoop-client-runtime JARs.
We do not actually specify which HBase libraries the connector needs on the Spark client side, but at least the Cloudera docs specify the classes provided by "hbase mapredcp", which include the full unshaded Hadoop JAR set.
Investigate whether hbase-shaded-client-byo-hadoop together with hadoop-client-api and hadoop-client-runtime is enough for the connector and, if so, document how to set the Spark classpath.
Alternatively, if hbase-shaded-client-byo-hadoop is not enough, check whether hbase-shaded-mapreduce plus the two shaded Hadoop client JARs above provides everything needed.
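As a possible starting point for the investigation, the entries printed by "hbase mapredcp" can be grouped into shaded Hadoop client JARs, unshaded Hadoop JARs, and everything else. The sketch below is illustrative only: it assumes the classpath is a colon-separated list of JAR paths and that unshaded Hadoop artifacts can be recognized by a file name starting with "hadoop-". The function name classify_classpath is hypothetical and not part of HBase or Spark.

```python
import os

# The two shaded Hadoop client artifacts named in the description.
SHADED_HADOOP_CLIENT = ("hadoop-client-api", "hadoop-client-runtime")


def classify_classpath(classpath, sep=":"):
    """Group classpath entries by kind, judging by file name only (an assumption)."""
    groups = {"shaded_hadoop": [], "unshaded_hadoop": [], "other": []}
    for entry in classpath.split(sep):
        if not entry:
            continue  # skip empty segments, e.g. from a trailing separator
        name = os.path.basename(entry)
        if name.startswith(SHADED_HADOOP_CLIENT):
            groups["shaded_hadoop"].append(name)
        elif name.startswith("hadoop-"):
            # Any other hadoop-* JAR is treated as unshaded (naming heuristic).
            groups["unshaded_hadoop"].append(name)
        else:
            groups["other"].append(name)
    return groups
```

Passing the output of "hbase mapredcp" to this function would list, under "unshaded_hadoop", the JARs that a classpath built from hbase-shaded-client-byo-hadoop or hbase-shaded-mapreduce would be expected to leave out.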
Issue Links
- is depended upon by
  - HBASE-28214 Document Spark classpath requirements for the Spark connector (Open)
  - PHOENIX-7120 Investigate changes needed for Spark connectors shading if hbase-shaded-client-byo-hadoop is used (Resolved)
- is related to
  - SPARK-33618 hadoop-aws doesn't work (Resolved)
  - SPARK-33212 Upgrade to Hadoop 3.2.2 and move to shaded clients for Hadoop 3.x profile (Resolved)