Description
I ran the run_sync_tool.sh script after cloning and building a fresh copy of apache-hudi (branch: release-0.12.0). The script failed with a classpath-related error (NoClassDefFoundError). The relevant sequence of commands and their output follows:
$ git branch
* (HEAD detached at release-0.12.0)
$ mvn -Dspark3.2 -Dscala-2.12 -DskipTests -Dcheckstyle.skip -Drat.skip clean install
$ echo $HADOOP_HOME
/home/pramod/2installers/hadoop-2.7.4
$ echo $HIVE_HOME
/home/pramod/2installers/apache-hive-3.1.3-bin
$ /run_sync_tool.sh --jdbc-url jdbc:hive2:\/\/hiveserver:10000 --partitioned-by bucket --base-path /2-pramod/tmp/gcs-integration-test/data/meta-gcs --database default --table gcs_meta_hive_4 > log.out 2>&1
setting hadoop conf dir
Running Command : java -cp /home/pramod/2installers/apache-hive-3.1.3-bin/lib/hive-metastore-3.1.3.jar::/home/pramod/2installers/apache-hive-3.1.3-bin/lib/hive-service-3.1.3.jar::/home/pramod/2installers/apache-hive-3.1.3-bin/lib/hive-exec-3.1.3.jar::/home/pramod/2installers/apache-hive-3.1.3-bin/lib/hive-jdbc-3.1.3.jar:/home/pramod/2installers/apache-hive-3.1.3-bin/lib/hive-jdbc-handler-3.1.3.jar::/home/pramod/2installers/apache-hive-3.1.3-bin/lib/jackson-annotations-2.12.0.jar:/home/pramod/2installers/apache-hive-3.1.3-bin/lib/jackson-core-2.12.0.jar:/home/pramod/2installers/apache-hive-3.1.3-bin/lib/jackson-core-asl-1.9.13.jar:/home/pramod/2installers/apache-hive-3.1.3-bin/lib/jackson-databind-2.12.0.jar:/home/pramod/2installers/apache-hive-3.1.3-bin/lib/jackson-dataformat-smile-2.12.0.jar:/home/pramod/2installers/apache-hive-3.1.3-bin/lib/jackson-mapper-asl-1.9.13.jar:/home/pramod/2installers/apache-hive-3.1.3-bin/lib/jackson-module-scala_2.11-2.12.0.jar::/home/pramod/2installers/hadoop-2.7.4/share/hadoop/common/*:/home/pramod/2installers/hadoop-2.7.4/share/hadoop/mapreduce/*:/home/pramod/2installers/hadoop-2.7.4/share/hadoop/hdfs/*:/home/pramod/2installers/hadoop-2.7.4/share/hadoop/common/lib/*:/home/pramod/2installers/hadoop-2.7.4/share/hadoop/hdfs/lib/*:/home/pramod/2installers/hadoop-2.7.4/etc/hadoop:/3-pramod/3workspace/apache-hudi/hudi-sync/hudi-hive-sync/../../packaging/hudi-hive-sync-bundle/target/hudi-hive-sync-bundle-0.12.0.jar org.apache.hudi.hive.HiveSyncTool --jdbc-url jdbc:hive2://hiveserver:10000 --partitioned-by bucket --base-path /2-pramod/tmp/gcs-integration-test/data/meta-gcs --database default --table gcs_meta_hive_4
2022-09-08 10:53:24,335 INFO [main] conf.HiveConf (HiveConf.java:findConfigFile(187)) - Found configuration file file:/home/pramod/2installers/apache-hive-3.1.3-bin/conf/hive-site.xml
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.hadoop.security.authentication.util.KerberosUtil (file:/2-pramod/installers/hadoop-2.7.4/share/hadoop/common/lib/hadoop-auth-2.7.4.jar) to method sun.security.krb5.Config.getInstance()
WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.security.authentication.util.KerberosUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
2022-09-08 10:53:25,876 WARN [main] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2022-09-08 10:53:26,359 INFO [main] table.HoodieTableMetaClient (HoodieTableMetaClient.java:<init>(121)) - Loading HoodieTableMetaClient from /2-pramod/tmp/gcs-integration-test/data/meta-gcs
2022-09-08 10:53:26,568 INFO [main] table.HoodieTableConfig (HoodieTableConfig.java:<init>(243)) - Loading table properties from /2-pramod/tmp/gcs-integration-test/data/meta-gcs/.hoodie/hoodie.properties
2022-09-08 10:53:26,585 INFO [main] table.HoodieTableMetaClient (HoodieTableMetaClient.java:<init>(140)) - Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /2-pramod/tmp/gcs-integration-test/data/meta-gcs
2022-09-08 10:53:26,586 INFO [main] table.HoodieTableMetaClient (HoodieTableMetaClient.java:<init>(143)) - Loading Active commit timeline for /2-pramod/tmp/gcs-integration-test/data/meta-gcs
2022-09-08 10:53:26,727 INFO [main] timeline.HoodieActiveTimeline (HoodieActiveTimeline.java:<init>(129)) - Loaded instants upto : Option{val=[20220907220948700__commit__COMPLETED]}
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/http/config/Lookup
at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107)
at java.sql/java.sql.DriverManager.getConnection(DriverManager.java:677)
at java.sql/java.sql.DriverManager.getConnection(DriverManager.java:228)
at org.apache.hudi.hive.ddl.JDBCExecutor.createHiveConnection(JDBCExecutor.java:104)
at org.apache.hudi.hive.ddl.JDBCExecutor.<init>(JDBCExecutor.java:59)
at org.apache.hudi.hive.HoodieHiveSyncClient.<init>(HoodieHiveSyncClient.java:91)
at org.apache.hudi.hive.HiveSyncTool.initSyncClient(HiveSyncTool.java:101)
at org.apache.hudi.hive.HiveSyncTool.<init>(HiveSyncTool.java:95)
at org.apache.hudi.hive.HiveSyncTool.main(HiveSyncTool.java:358)
Caused by: java.lang.ClassNotFoundException: org.apache.http.config.Lookup
at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:582)
at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
... 9 more
I have also attached a modified version of run_sync_tool.sh that makes the sync work for me. The main changes are adding jars from the build classpath to the script's classpath and adding HIVE_CONF_DIR to the classpath.
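For readers without the attachment, here is a minimal sketch of the kind of change involved. It is not the attached script. The classpath printed by the script above lists no httpcore jar, and org.apache.http.config.Lookup is a class from Apache HttpComponents httpcore, so the sketch appends the httpcore and httpclient jars plus HIVE_CONF_DIR. The helper names `join_classpath` and `jars_matching`, the `/opt/hive` fallback, and the assumption that those jars sit under `$HIVE_HOME/lib` are all hypothetical and should be checked against your installation.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and default paths are illustrative assumptions.

# Join all non-empty arguments with ':' so no empty classpath entries ("::") appear.
join_classpath() {
  local result="" entry
  for entry in "$@"; do
    [ -z "$entry" ] && continue
    if [ -z "$result" ]; then
      result="$entry"
    else
      result="$result:$entry"
    fi
  done
  printf '%s\n' "$result"
}

# Print every regular file in directory $1 whose name matches glob $2, one per line.
# Prints nothing when there is no match.
jars_matching() {
  local jar
  for jar in "$1"/$2; do   # $2 is left unquoted on purpose so the glob expands
    [ -f "$jar" ] && printf '%s\n' "$jar"
  done
  return 0
}

# Assumed locations; adjust to your environment.
HIVE_HOME="${HIVE_HOME:-/opt/hive}"
HIVE_CONF_DIR="${HIVE_CONF_DIR:-$HIVE_HOME/conf}"

# Unquoted command substitutions split on whitespace, so paths containing
# spaces are not handled by this sketch.
EXTRA_CP=$(join_classpath "$HIVE_CONF_DIR" \
  $(jars_matching "$HIVE_HOME/lib" 'httpcore-*.jar') \
  $(jars_matching "$HIVE_HOME/lib" 'httpclient-*.jar'))

echo "Extra classpath entries: $EXTRA_CP"
```

The resulting `EXTRA_CP` value would then be appended, separated by `:`, to the classpath that run_sync_tool.sh passes to `java -cp`. If the jars are not under `$HIVE_HOME/lib` in your setup, point `jars_matching` at the directory that does contain them.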