  Apache NiFi / NIFI-4496

Improve performance of CSVReader

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.5.0
    • Component/s: Extensions
    • Labels: None

    Description

      During throughput testing, it was noted that the CSVReader was not as fast as desired, processing fewer than 50k records per second. A look at this benchmark suggests that the Apache Commons CSV parser (used by CSVReader) is quite slow compared to others.

      From that benchmark, it appears that CSVReader could be improved by using a different CSV parser under the hood. Jackson may be the best choice, as it is fast when values are quoted and is a mature, maintained codebase.
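
      The records-per-second figure above can be measured with a small harness. The following is a minimal sketch, not the benchmark referenced in this issue and not NiFi code: it uses Python's standard csv module as a stand-in parser, and the names make_csv, measure_throughput, and stdlib_parse are hypothetical helpers introduced only for illustration. The same shape (generate quoted records, time one full pass, divide count by elapsed time) would apply when comparing Commons CSV against Jackson on the JVM.

```python
import csv
import io
import time


def make_csv(num_records, num_fields=5):
    """Build an in-memory CSV document with a header row and fully quoted values."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL)
    writer.writerow([f"field{i}" for i in range(num_fields)])
    for row in range(num_records):
        writer.writerow([f"value{row}_{i}" for i in range(num_fields)])
    return buf.getvalue()


def stdlib_parse(text):
    """Stand-in parser: yields one dict per record (the header row is excluded)."""
    return csv.DictReader(io.StringIO(text))


def measure_throughput(parse, text):
    """Run one full pass of `parse` over `text`; return (record_count, records_per_second)."""
    start = time.perf_counter()
    count = sum(1 for _ in parse(text))
    elapsed = time.perf_counter() - start
    rate = count / elapsed if elapsed > 0 else float("inf")
    return count, rate


if __name__ == "__main__":
    data = make_csv(100_000)
    count, rate = measure_throughput(stdlib_parse, data)
    print(f"{count} records at {rate:,.0f} records/second")
```

      Because measure_throughput takes the parser as an argument, a second implementation can be timed against the same generated input, which keeps the comparison limited to parsing cost. A single pass is noisy; repeating the measurement and discarding warm-up runs would give a steadier number.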

            People

              Assignee: Matt Burgess (mattyb149)
              Reporter: Matt Burgess (mattyb149)
              Votes: 0
              Watchers: 4
