Apache Hudi / HUDI-2404

JMX Reporter broken using spark package

Details

    Description

      I am trying to enable JMX monitoring when using Hudi with Spark.

      I followed the Spark Quickstart tutorial (https://hudi.apache.org/docs/quick-start-guide). When I added the options to enable JMX monitoring in the insert data stage (https://hudi.apache.org/docs/quick-start-guide#insert-data), I got the following error: `java.lang.NoClassDefFoundError: org/apache/hudi/com/codahale/metrics/jmx/JmxReporter`

      Code used:

      df.write.format("hudi").
        options(getQuickstartWriteConfigs).
        option("hoodie.metrics.on", "true").
        option("hoodie.metrics.reporter.type", "JMX").
        option("hoodie.metrics.reporter.port", "9889").
        option(PRECOMBINE_FIELD_OPT_KEY, "ts").
        option(RECORDKEY_FIELD_OPT_KEY, "uuid").
        option(PARTITIONPATH_FIELD_OPT_KEY, "partitionpath").
        option(TABLE_NAME, tableName).
        mode(Append).
        save(basePath)

      metrics-jmx is a separate artifact that is not included in metrics-core (see https://metrics.dropwizard.io/4.2.0/about/release-notes.html), so I believe it should be added to the pom.xml here: https://github.com/apache/hudi/blob/master/packaging/hudi-spark-bundle/pom.xml#L92-L93. When I updated my local pom.xml to include metrics-jmx and repeated the setup, I was able to enable JMX successfully. The issue seems to affect Flink as well. I can submit a PR if that is easier.
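      For anyone who wants to confirm whether a given bundle is affected before running a write, a minimal sketch of a classpath check is below. It only uses JDK class loading; the object name `JmxClasspathCheck` and the helper `isOnClasspath` are hypothetical names made up for this illustration, and the class name is copied from the NoClassDefFoundError above.

```scala
// Sketch only: reports whether the shaded JmxReporter class can be located.
// JmxClasspathCheck and isOnClasspath are illustrative names, not part of Hudi.
object JmxClasspathCheck {
  // Class name taken from the NoClassDefFoundError in this report.
  val ShadedJmxReporter: String =
    "org.apache.hudi.com.codahale.metrics.jmx.JmxReporter"

  // Returns true if the class loader can find the class. The class is not
  // initialized (second argument is false), so no static code runs.
  def isOnClasspath(className: String): Boolean =
    try {
      Class.forName(className, false, Thread.currentThread().getContextClassLoader)
      true
    } catch {
      case _: ClassNotFoundException | _: NoClassDefFoundError => false
    }

  def main(args: Array[String]): Unit =
    println(s"$ShadedJmxReporter present: ${isOnClasspath(ShadedJmxReporter)}")
}
```

      In spark-shell, the object can be pasted in and called with `JmxClasspathCheck.isOnClasspath(JmxClasspathCheck.ShadedJmxReporter)`. A result of `false` would be consistent with the bundle missing metrics-jmx, assuming the shell was started with the same Hudi bundle as the failing job.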

            People

              shivnarayan sivabalan narayanan
              sarahwitt Sarah Witt
              Votes: 0
              Watchers: 3

                Time Tracking

                  Estimated: 1h
                  Remaining: 1h
                  Logged: Not Specified