NIFI-10666: PrometheusReportingTask does not use UTF-8 encoding on /metrics/ endpoint

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 1.17.0, 1.16.3, 1.18.0, 1.23.2, 1.25.0, 2.0.0-M2
    • Fix Version/s: None
    • Component/s: Extensions
    • Environment: JVM with non-UTF-8 default encoding (e.g. default Windows installation)

    Description

      We have created a default PrometheusReportingTask for our NiFi instance and tried to consume the metrics with Prometheus. However, Prometheus threw the following error:

      ts=2022-10-19T12:25:18.110Z caller=scrape.go:1332 level=debug component="scrape manager" scrape_pool=nifi-cluster target=http://***nifi***:9092/metrics msg="Append failed" err="invalid UTF-8 label value" 

      Upon further inspection, we noticed that the /metrics/ endpoint exposed by the reporting task does not use UTF-8 encoding, which is required by Prometheus (as documented here: [Exposition formats | Prometheus|https://prometheus.io/docs/instrumenting/exposition_formats/]).

      Our flow uses non-ASCII characters (in our case German umlauts such as "ü") in component names. Removing those characters works around the Prometheus error, but that is not practical for a large flow written in German.

      Opening the /metrics/ endpoint in a browser confirms that the encoding used is not UTF-8:

      > document.characterSet
      'windows-1252' 
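
To illustrate why Prometheus rejects such a response, the following sketch encodes a label value containing "ü" once as windows-1252 and once as UTF-8, then checks each byte sequence with a strict UTF-8 decoder. The class name, the helper `isValidUtf8`, and the label value are hypothetical and are not taken from NiFi or Prometheus; the sketch also assumes the JVM ships the windows-1252 charset, which is not guaranteed on every runtime.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class EncodingMismatchDemo {

    // Returns true only if the bytes form a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical label value; \u00fc is "ü", escaped so the source
        // file compiles the same way regardless of its own encoding.
        String label = "Pr\u00fcfung";

        // windows-1252 encodes "ü" as the single byte 0xFC, which can
        // never appear in well-formed UTF-8.
        byte[] cp1252 = label.getBytes(Charset.forName("windows-1252"));
        // UTF-8 encodes "ü" as the two bytes 0xC3 0xBC.
        byte[] utf8 = label.getBytes(StandardCharsets.UTF_8);

        System.out.println(isValidUtf8(cp1252)); // prints false
        System.out.println(isValidUtf8(utf8));   // prints true
    }
}
```

A scraper that validates label values as UTF-8 would fail on the first byte sequence in the same way, which matches the "invalid UTF-8 label value" message above.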

      The responsible code might be here:

      https://github.com/apache/nifi/blob/2be5c26f287469f4f19f0fa759d6c1b56dc0e348/nifi-nar-bundles/nifi-prometheus-bundle/nifi-prometheus-reporting-task/src/main/java/org/apache/nifi/reporting/prometheus/PrometheusServer.java#L67

      The PrometheusServer used by the reporting task creates an OutputStreamWriter with the JVM's default encoding instead of explicitly specifying UTF-8. The Content-Type header set in that function also does not reach the client (see screenshot).
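
A possible direction for a fix, shown as a minimal standalone sketch rather than a patch against the actual PrometheusServer code: pass `StandardCharsets.UTF_8` to the `OutputStreamWriter` constructor so the output no longer depends on the JVM default. The class `Utf8MetricsWriter`, its methods, and the sample metric line are hypothetical. The content type string is an assumption based on the Prometheus text exposition format and would still need to be set on the real HTTP response.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class Utf8MetricsWriter {

    // Assumed content type for the text exposition format, with an explicit charset.
    static final String CONTENT_TYPE = "text/plain; version=0.0.4; charset=utf-8";

    // Writes the exposition text as UTF-8, independent of the JVM default encoding.
    static void writeMetrics(OutputStream out, String expositionText) throws IOException {
        Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
        writer.write(expositionText);
        writer.flush(); // push buffered characters through to the underlying stream
    }

    // Convenience helper for in-memory use: returns the encoded bytes.
    static byte[] render(String expositionText) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            writeMetrics(buffer, expositionText);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        // Hypothetical metric line; \u00fc is "ü".
        String line = "example_metric{component_name=\"Pr\u00fcfung\"} 1.0\n";
        byte[] body = render(line);
        // The bytes decode back to the original text when read as UTF-8.
        System.out.println(new String(body, StandardCharsets.UTF_8).equals(line));
    }
}
```

Starting the JVM with a UTF-8 default encoding may also mask the problem, but an explicit charset in the code keeps the endpoint correct regardless of how the JVM is launched.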

People

    • Assignee: Unassigned
    • Reporter: René Zeidler
