Apache NiFi / NIFI-7791

Add PutClickHouse Processor for Writing Large Streams


Details

    • Type: New Feature
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved

    Description

      ClickHouse supports streaming a number of file formats directly through its JDBC (superset) library. It is often much more convenient to stream the contents of a file directly to ClickHouse than to process the data in NiFi and then use the native JDBC processor.

      One workaround is to use PutHTTP to stream the file directly to ClickHouse through its HTTP endpoint. However, this gets tedious, especially if you need to pass credentials as part of the HTTP call.
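      For context, the HTTP workaround amounts to roughly the following. This is a minimal sketch, not code from the pending PR: the class and method names are hypothetical, plain HTTP on the given port is assumed, and credentials are sent in the `X-ClickHouse-User` / `X-ClickHouse-Key` headers. The INSERT statement goes in the `query` parameter and the file is the request body.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

// Hypothetical helper illustrating the HTTP workaround; names are not from NiFi or the PR.
class ClickHouseHttpInsert {

    // Builds the endpoint URI. The INSERT statement travels in the "query" parameter.
    static URI insertUri(String host, int port, String table, String format) {
        String query = "INSERT INTO " + table + " FORMAT " + format;
        // URLEncoder emits '+' for spaces; use %20 so the query string is unambiguous.
        String encoded = URLEncoder.encode(query, StandardCharsets.UTF_8).replace("+", "%20");
        return URI.create("http://" + host + ":" + port + "/?query=" + encoded);
    }

    // Sends the file as the POST body. The body is read from disk by the client
    // rather than being loaded into a byte array first.
    static int send(URI uri, String user, String password, Path file) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(uri)
                .header("X-ClickHouse-User", user)     // credentials as headers (assumed setup)
                .header("X-ClickHouse-Key", password)
                .POST(HttpRequest.BodyPublishers.ofFile(file))
                .build();
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode();
    }

    public static void main(String[] args) {
        // Only prints the URI; call send(...) against a real server to perform the insert.
        System.out.println(insertUri("localhost", 8123, "events", "CSV"));
    }
}
```

      Every flow that uses this approach has to repeat the URL encoding and credential handling, which is the tedium a dedicated processor would remove.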

      I'm creating this Jira to propose a simple PutClickHouse processor that can stream a FlowFile directly to ClickHouse, with the following features:

      • Support for the CSV, CSVWithNames, TSV and JSONEachRow formats
      • Ability to modify column name ordering
      • Custom delimiters for CSV and TSV
      • SSL support (with and without strict mode)
      • Multiple hosts (comma separated) to utilize the BalancedClickhouseDataSource
      • Username and Password
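
      To make the feature list concrete, here is a rough sketch of how the format, column ordering, host list, and SSL options might translate into an INSERT statement and a JDBC URL. It is an illustration only, not the code from the pending PR. The class and method names are hypothetical, and the URL shape (comma-separated hosts, `ssl` and `sslmode` parameters) is an assumption based on the legacy ClickHouse JDBC driver conventions that should be checked against the driver version in use.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names come from NiFi or the ClickHouse driver.
class PutClickHouseSketch {

    // The four formats listed in this issue.
    static final Set<String> FORMATS = Set.of("CSV", "CSVWithNames", "TSV", "JSONEachRow");

    // Builds e.g. "INSERT INTO t (a, b) FORMAT CSV".
    // An empty column list leaves the ordering to the table definition.
    static String insertStatement(String table, List<String> columns, String format) {
        if (!FORMATS.contains(format)) {
            throw new IllegalArgumentException("Unsupported format: " + format);
        }
        StringBuilder sql = new StringBuilder("INSERT INTO ").append(table);
        if (!columns.isEmpty()) {
            sql.append(" (").append(String.join(", ", columns)).append(")");
        }
        return sql.append(" FORMAT ").append(format).toString();
    }

    // Builds a multi-host JDBC URL (assumed shape) from a comma-separated host list.
    // strict=false maps to sslmode=none, i.e. SSL without certificate verification.
    static String jdbcUrl(String commaSeparatedHosts, String database, boolean ssl, boolean strict) {
        String hosts = commaSeparatedHosts.replaceAll("\\s+", ""); // tolerate "a:8123, b:8123"
        String url = "jdbc:clickhouse://" + hosts + "/" + database;
        if (ssl) {
            url += "?ssl=true&sslmode=" + (strict ? "strict" : "none");
        }
        return url;
    }

    public static void main(String[] args) {
        System.out.println(insertStatement("events", List.of("ts", "user_id"), "CSV"));
        System.out.println(jdbcUrl("ch1:8123, ch2:8123", "default", true, false));
    }
}
```

      The username and password would be passed to the data source as connection properties rather than embedded in the URL.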

      I'm currently wrapping up a PR for this. I wrote it in Kotlin, which requires a Maven plugin scoped to the processor module. If there's enough objection, it can be rewritten in plain Java.

      +joewitt since I spoke with him regarding this a while back.

          People

            rickysaltzer Ricky Saltzer