Apache NiFi / NIFI-4696

Support concurrent tasks in PutHiveStreaming


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • None
    • Fix Version/s: 1.5.0
    • None
    • None

    Description

      Currently, PutHiveStreaming (PHS) supports only a single task at a time. Before NIFI-4342, that meant each target table needed its own PHS instance, which can be cumbersome with large numbers of tables. After NIFI-4342, Expression Language can be used for SDLC purposes (e.g., database/table changes between development and production).

      However, it would be useful to support at least database and table names taken from flow file attributes, and also to support multiple tasks that handle them concurrently. Given the nature of PHS and the Streaming Ingest APIs (and their implementation), it is likely not prudent to allow two tasks to write to the same table and partition at the same time.

      I propose adding flow file attribute EL evaluation where prudent, and allowing per-table concurrency in PHS. A thread will attempt to get a lock on a table; if it cannot, it will roll back and return.
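
      The per-table locking idea could be sketched roughly as follows. This is only an illustration, not the code that was committed for this issue: the class `TableLocks` and its methods are hypothetical names, and it assumes a simple "claim" set keyed by `database.table` instead of whatever lock type the processor actually uses. A task that cannot claim its table returns `false`, which the caller would treat as the signal to roll back and return.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper (not part of NiFi): tracks which tables are being written.
final class TableLocks {
    // Thread-safe set of "database.table" keys currently claimed by a task.
    private final Set<String> busyTables = ConcurrentHashMap.newKeySet();

    // Non-blocking attempt to claim a table; false means another task holds it.
    boolean tryAcquire(String database, String table) {
        return busyTables.add(key(database, table));
    }

    // Release a previously claimed table so other tasks can write to it.
    void release(String database, String table) {
        busyTables.remove(key(database, table));
    }

    // Run the work only if the table is free. Returns false without running
    // anything when the table is busy, so the caller can roll back and return.
    boolean runIfFree(String database, String table, Runnable work) {
        if (!tryAcquire(database, table)) {
            return false;
        }
        try {
            work.run();
            return true;
        } finally {
            release(database, table);
        }
    }

    private static String key(String database, String table) {
        return database + "." + table;
    }
}
```

      In this sketch, tasks targeting different tables proceed in parallel, while a second task targeting a busy table gives up immediately instead of blocking.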

            People

              Assignee: mattyb149 Matt Burgess
              Reporter: mattyb149 Matt Burgess
              Votes: 0
              Watchers: 4
